THE RELATION OF SECONDARY REINFORCEMENT
TO DELAYED REWARD IN VISUAL DISCRIMINATION LEARNING¹


BY G. ROBERT GRICE²


Psychological Laboratory, State University of Iowa


INTRODUCTION 


Delayed reward is a problem of central importance for reinforcement
theories of learning. In his goal gradient hypothesis, Hull (4)
postulated that the strength of the association formed between a re- 
sponse and its accompanying stimuli is inversely related to the length 
of the delay by which the reward follows the response. The gradient
was assumed to be logarithmic in form. This hypothesis was supported
by the delayed reward experiments of Hamilton (3) and Wolfe
(11), both of which showed learning to be a decreasing function of 
the delay of reward. These studies indicated also that the gradient
extended for a considerable period of time following the response. 
Subsequently, Hull (5) pointed out the possibility that the learning 
after long delays in the Wolfe and Hamilton experiments might have 
been the result of immediate secondary reinforcement. In both of 
these maze learning experiments the Ss were detained in delay com- 
partments for a period of time prior to entering the goal box. It 
was suggested that the delay boxes, being followed by reward, might 
have become secondary reinforcing agents. 

In an attempt to minimize secondary reinforcement immediately
following the response to be learned, Perin (6, 7) employed a Skinner-
type box in which the lever pressing response, the delay, and the reward
all occurred in the same compartment. The results of these
experiments indicated that learning under such conditions was impossible
with delays of about 30 sec. or longer. On the basis of these
findings, Hull (5) postulated a short primary delay of reinforcement
gradient of about 30 sec. The longer goal gradient was then derived
from this by assuming the development of secondary reinforcement
within the 30-sec. period, and the gradual moving forward of such
secondary reinforcing property in the stimulus response sequence.

¹ A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor
of Philosophy in the Department of Psychology in the Graduate College of the State University
of Iowa. The writer is indebted to Professor Kenneth W. Spence who directed the investigation.

² Now at the University of Illinois.

Further evidence of the importance of secondary reinforcement in
delayed reward learning is provided in an experiment by Perkins (8). 
Perkins employed a covered T-maze of which both the top and bottom 
were opal-flashed glass, thus eliminating all extra-maze cues. The 
possibility of secondary reinforcement by the two different delay 
boxes was eliminated by interchanging the two delay compartments 
so that each was followed by reward half of the time. When a group 
for which the delay compartments were interchanged in this manner 
was trained with 45 sec. delay, it was found to learn significantly more 
slowly than another 45 sec. delay group for which the same delay box 
was always followed by food. This difference in rate of learning was
attributed by Perkins to the action of secondary reinforcement in 
the case in which the delay compartments were not interchanged, 
and to the elimination of differential secondary reinforcement by the 
shifting procedure. Perkins then went on to study learning in the 
T-maze as a function of delay with the delay compartments inter- 
changed. He obtained a function which dropped more sharply than 
that obtained by Wolfe, and which showed only a barely significant 
amount of learning with 120 sec. delay. 

Spence (10), in a recent analysis of this problem, has suggested 
that even the shortened gradients obtained by Perin and Perkins may 
be the result of immediate secondary reinforcement. His suggestion 
is that while these experiments may have succeeded in eliminating 
differential secondary reinforcing stimuli from the external environ- 
ment, such stimuli may have existed within the animal. Thus the 
particular pattern of proprioceptive stimulation following the cor- 
rect response would presumably persist within the organism for 
a short period, and might still be effective at the time of reward. 
In such an event, the proprioceptive pattern of stimulation coinci- 
dent with the reward would acquire secondary reinforcing properties 
through its association with the immediate food reinforcement. On 
subsequent occurrences of the response, the proprioceptive stimuli 
resulting from the act, being similar to the proprioceptive traces per- 
sisting until the moment of reinforcement, could, through generaliza- 
tion, provide immediate secondary reinforcement. The length of 
delay during which learning could occur, would depend then upon 
the length of time that the changing proprioceptive stimulus trace 
remained sufficiently similar to that at the time of the response to 
permit generalization. 

Under such an interpretation there is no need for the assumption 
of a primary delay of reinforcement gradient, as all instances of learning
under delayed reward conditions would be accounted for by immediate
secondary reinforcement. This formulation relieves the
learning theorist of the embarrassing problem of explaining how a 
reward can work backwards to strengthen a stimulus-response as- 
sociation, when the response was made some time earlier. 

One possible experimental test of this interpretation is to introduce 
delay of reward in a non-spatial type of discrimination learning 
problem, e.g., visual discrimination learning. In spatial discrimi- 
nation problems such as the T-mazes used by Wolfe and Perkins, the 
S is forced to make different spatial responses, turning right or left, 
which provide very different proprioceptive stimulation. Since one 
of these responses is consistently followed by reward, and the other is 
never reinforced, differential proprioceptive secondary reinforcement 
may be built up so long as the proprioceptive traces of the different 
acts are discriminably different at the moment of food reward. How- 
ever, in the non-spatial visual discrimination situation, it is possible 
to eliminate such immediate secondary reinforcement based on dif- 
ferential proprioceptive stimulation. Since the positions of the posi- 
tive and negative stimuli are shifted irregularly from left to right, each 
motor response of turning left or right is correct half of the time and 
incorrect half of the time. This condition means that neither pattern 
of proprioceptive stimulation acquires greater secondary reinforcing 
strength than the other. The remaining possibility of immediate 
secondary reinforcement in this type of learning problem is that the 
stimulus traces of the visual stimuli provide the basis of the differ- 
ential stimulation. So long as a trace of the positive stimulus is 
effective at the time of reward, this trace may become a secondary 
reinforcing agent, and the positive stimulus itself might then come 
to provide immediate secondary reinforcement. The limit within 
which delayed reward learning could take place in a visual discrimi- 
nation problem would depend then on the time that the after-effects 
(stimulus traces) of the positive and negative stimuli remain discrim- 
inably different. 

A further implication of the above analysis for visual discrimi- 
nation learning is that it should be possible to improve learning with 
delayed reward by experimentally introducing immediate secondary 
reinforcement. For example, if the positive stimulus itself, or stimuli 
within the range of its generalization gradient, were present at the 
time of reward, this stimulus should acquire secondary reinforcing
properties, and would then provide immediate reinforcement, thus
increasing the speed of learning, and lengthening the delay with which 
learning might occur. Furthermore, if the Ss were forced to respond 
with characteristically different motor patterns to the different choice
stimuli, the proprioceptive traces of the response to the correct stimulus
should, within limits, acquire secondary reinforcement properties,
and extend the delay of reinforcement gradient.


Several investigators have employed delayed reward in visual discrimination
situations. Wood (12), using chicks in a brightness discrimination,
found learning to be a decreasing function of delayed reward up to five
min. However, the chicks were delayed only following correct choices.
Following errors they received immediate electric shock. Thus, the avoidance
of shock on correct trials may have provided immediate reinforcement,
and the experiment is not an uncomplicated delayed reward situation.
Wolfe (11), in a black-white discrimination experiment with rats, found no
learning with delayed reward when the delay followed both correct and
incorrect responses. However, when the door was blocked, providing
immediate frustration of wrong responses, results were obtained similar to those
of his T-maze experiment. Like the Wood experiment, this one is also probably
not a genuine delayed reward situation. The delay compartment,
which after the delay always leads immediately to food, would acquire
secondary reinforcing properties. Under Wolfe’s procedure, only correct
responses are followed by entrance into the delay compartment. Thus,
immediate secondary reinforcement is provided for correct responses but
not for incorrect responses. Wolfe’s incidental finding, that learning is
greatly retarded when delays follow choices of both stimuli, has been verified
by two other investigators. Riesen (9), using a red-green discrimination
with chimpanzees, found greatly retarded learning with one- and
two-sec. delays, and one of two Ss failed to learn with four sec. delay. There
was no evidence of learning with eight sec. delay. However, several specially
trained Ss, trained to respond differentially to the stimuli, were able
to learn with eight sec. delay. Gulde (2), in a study with rats in a black-
white discrimination, found no evidence of learning in 200 trials with five
sec. delay.


In the present experiment, the learning of a black-white discrimi- 
nation problem was studied as a function of the time of the delay of 
reward, in order to ascertain the limit and form of this relationship. 
The function presumably depends on the stimulus traces of the black 
and white stimuli. Second, the effect of introducing immediate 
secondary reinforcement was studied. This was accomplished first
by the use of black and white goal boxes, so that following a choice of either
black or white, the animal, after the delay, always entered a goal box 
of the same color as the stimulus chosen. The final experiment was 
to force the animal to make characteristically different motor adjustments
to the black and white stimuli. Learning in this situation was
then compared with that in a problem of equal delay of reward in
which no such characteristically different motor responses were made.


SUBJECTS AND APPARATUS 


The Ss were 75 experimentally naive, female, albino rats from the colony maintained by the 
department of psychology of the State University of Iowa. Their ages ranged from 80 to 11
days at the beginning of the experiment. They were assigned at random to the experimental 
groups. 
Fig. 1. Ground plan of the experimental apparatus. Doors are represented by heavy dotted
lines at the points D. Curtains are represented by the light dotted line at C. 


The apparatus consisted of a black-white discrimination box. The ground plan is shown 
in Fig. 1. The rat was placed in the starting box from which it passed through a two-in.-wide 
alley into the choice chamber. From this point it could enter either a black or a white painted 
alley. The floor of the half of the choice chamber leading to the black alley was painted black 
and the floor of the half leading to the white alley was white. In each alley two in. from the 
entrance there were black or white curtains the same color as the alley. The section of the 
apparatus which made up the black and white alleys and the floor of the choice chamber con- 
sisted of three identical alleys with the two outer ones white and the middle one black. By 
sliding this section back and forth, the black and white alleys could be shifted from right to left. 
After passing through either the black or white stimulus alley, the animal entered a neutral gray 
alley which was used as a delay compartment, and could be varied in length from 18 to 72 in. 
The goal boxes were continuations of these alleys and were 15 in. long. All alleys except the 
narrow starting alley were four in. wide and four in. high. With the exception of the black and 
white alleys, the entire apparatus was painted a neutral gray. The choice chamber and the 
stimulus alleys were covered with clear glass and all others were covered with hardware cloth.
Vertical sliding doors, which prevented retracing, were located at the entrances to the stimuli 
alleys, at the beginning of the delay compartments and at the entrances to the goal boxes. The
doors were operated by E from behind a one-way vision screen at the starting end of the apparatus.

The lighting was indirect from two shaded 200-watt bulbs. The brightness of the floor of 
the white alley just in front of the curtain was 1.515 apparent foot candles. The brightness of
the floor of the black alley at the same point was 0.071 apparent foot candles.

In order to force the animals to make characteristically different motor responses to the 
black and white alleys, different obstacles could be placed in them. One of these sets of obstacles 


) 


consisted of a 15-degree incline, nine in. long, which began one in. beyond the curtain. The other 
consisted of two blocks 2½ by five in. and the same height as the alley. One block was placed on
the left side of the alley one in. beyond the curtain. The second was placed on the right side of
the alley two in. beyond the first. This forced the animal, after passing the curtain, to pass
through a five-in. section of alley 1½ in. wide, make a sharp jog to the left, and continue through
another narrow five-in. section before entering the delay compartment. Both the blocks and the 
inclines were available in black and white, so that the blocks could be in the white alley and the 
incline in the black, or the reverse. Removable black and white goal boxes were made of quarter-
in. plywood. These boxes fitted inside of the gray goal boxes and could be shifted from the left
to the right side.
PRELIMINARY TRAINING


All animals were adapted to a 24-hour feeding schedule for at least a week prior to preliminary
training. They were fed in individual cages and received eight gm. of Purina Laboratory
Chow daily. This diet and feeding procedure were the same throughout the experiment. During
the experiment they were fed immediately after the daily runs.


For the preliminary training, neutral gray alleys were substituted for the black and white 
stimulus alleys. On the first day the rats were placed in the goal box and allowed to eat ten 0.3
gm. pellets of Purina Chow. They remained in the goal box for 15 min. The animals which
were assigned to the zero delay group were fed in the removable gray alleys, and all others were 
fed in the regular gray goal boxes at the end of the 18-in. delay compartment. Next, the animals 
were placed in the gray alley between the curtain and the door at the entrance to the alley and 
were allowed to run through the alley and the delay compartment to the goal box where they 
received one 0.15 gm. pellet of food, which was the standard reward throughout the experiment.
There were four such runs, two on the left side and two on the right. The zero-delay animals
received food in the removable alley and did not run through the delay compartment. On the 
second day there were four more such runs. On the third and fourth days the animals were 
placed in the starting box and allowed to run through the gray alleys to food. They received 
10 runs each day, forced half to the right and half to the left in a random order. The forcing
was accomplished by closing one of the doors at the choice point. The purpose of these 20 forced 
trials was to help equalize position habits. The animals in the differential response group 
received their forced runs with gray blocks and a gray incline, otherwise identical to the black 
and white ones, placed in the gray alleys. Runs through the blocks and over the incline were 
divided equally between left and right. 


EXPERIMENTAL PROCEDURE
Gradient Experiment: 

In the experiment proper, all animals were trained to go to the white alley. All animals 
had either no initial color preference or a preference for the black. Three animals with initial 
white preferences were eliminated. There were 10 free choice trials per day for the first 200
trials and 20 trials per day after that. The white alley was alternated from left to right in the 
order RLRRLLRLLRLRLLRRLRRL. All trials were separated by at least two min. Animals 
were run until they reached the criterion of learning, which was 18 out of 20 trials correct, with 
the last 10 perfect. One animal was discontinued after failing to learn in 700 trials and four 
after failing to learn in 1440 trials. 
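
The position order and the criterion of learning described above can be checked mechanically. The following is a minimal sketch, not the scoring procedure actually used in the study. It assumes that the criterion is evaluated on a sliding window of the 20 most recent trials and that the reported trial count includes the criterion trials; the source states neither point. The function names and the sample record are hypothetical.

```python
from collections import Counter

# Left/right order of the white alley, copied from the procedure above.
POSITION_ORDER = "RLRRLLRLLRLRLLRRLRRL"


def position_counts(order):
    """Count how often the positive (white) alley appears on each side."""
    return Counter(order)


def trials_to_criterion(choices, window=20, min_correct=18, perfect_tail=10):
    """Return the trial number at which the criterion is first met, else None.

    `choices` is a sequence of booleans (True = correct choice). Assumed
    reading of the criterion: at least `min_correct` correct choices in the
    most recent `window` trials, with the last `perfect_tail` all correct.
    """
    for end in range(window, len(choices) + 1):
        recent = choices[end - window:end]
        if sum(recent) >= min_correct and all(recent[-perfect_tail:]):
            return end
    return None


if __name__ == "__main__":
    # The white alley is on each side equally often within the 20-trial order.
    print(position_counts(POSITION_ORDER))
    # Hypothetical record: 10 alternating right/wrong trials, then 20 correct.
    record = [True, False] * 5 + [True] * 20
    print(trials_to_criterion(record))
```

Because each side is positive on half of the trials in the order, a pure position habit yields 50 percent correct, which is why the criterion can only be met by responding to the black-white cue.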

Groups of animals were run under six different delay of reward conditions. The zero delay 
group received food immediately in the white alley. The 0.5 sec. delay group was allowed to 
run through the 18-in. delay compartment with the doors open and was rewarded in the goal box 
following choices of the white alley. The 1.2 sec. delay group ran through a 36-in. delay alley 
with the doors open and the 2 sec. delay group ran through a 72-in. alley. The times 0.5, 1.2 
and 2 sec. were determined by timing the animals on each run. The time measured was that 
from the leaving of the black or white alley to the entrance into the goal box. These times are 
the means for each group, of the median times for each animal in the group. The median times 
ranged from 0.4 to 0.6 sec. for the 0.5 sec. group, from 1.1 to 1.3 sec. for the 1.2 sec. group, and
from 1.5 to 2.4 sec. for the 2.0 sec. group. The distributions of delay times for individual animals 
were all positively skewed in the manner typical of such time data. The mean semi-interquartile 
ranges of these distributions were 0.1 sec. for the 0.5 sec. group, 0.2 sec. for the 1.2 sec. group 
and 0.2 sec. for the 2.0 sec. group. The five-sec. delay group was delayed for five sec. in the 
18-in. delay compartment by leaving the goal box door closed for five sec. after the animal entered 
the delay compartment. The delay for the 10 sec. group was accomplished in a similar manner. 
In all groups the delay was the same following correct and incorrect choices. Following all 
choices of the white, the rat received the pellet of food in a glass cup placed at the end of the goal 
box and was allowed to eat in the goal box. Following choices of the black there was no food or 
cup in the goal box, and the animal was allowed to remain in the box approximately the same 
amount of time as in the case of correct choices.³ There were 10 animals in each group except
in the 10 sec. delay group in which there were only five. 
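
The way the nominal delays of 0.5, 1.2 and 2 sec. were summarized (the group mean of each animal's median running time, with the semi-interquartile range as the measure of spread) can be illustrated as follows. This is only a sketch: the running times below are invented for illustration and are not data from the study, and the quartile convention (the `inclusive` method of Python's `statistics.quantiles`) is an assumption, since the source does not say how quartiles were computed.

```python
from statistics import mean, median, quantiles


def semi_interquartile_range(times):
    """Half the distance between the first and third quartiles."""
    q1, _, q3 = quantiles(times, n=4, method="inclusive")
    return (q3 - q1) / 2


def group_delay(times_by_animal):
    """Mean, over animals, of each animal's median delay time."""
    return mean(median(times) for times in times_by_animal)


# Hypothetical running times in sec. for two animals (positively skewed).
animal_a = [0.4, 0.5, 0.5, 0.6, 0.9]
animal_b = [0.5, 0.6, 0.6, 0.7, 1.2]

if __name__ == "__main__":
    print(group_delay([animal_a, animal_b]))
    print(semi_interquartile_range(animal_a))
```

Medians and quartile-based spread are less affected by the occasional very slow run than means and standard deviations would be, which suits positively skewed time data.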





³ The use of no food cup, rather than an empty one, in the goal box following incorrect
responses reduces the possibility of secondary reinforcement following wrong responses, and
probably, in part, accounts for the unusually rapid learning of the zero delay group.










Secondary Reinforcement Groups: 

One group of 10 rats was run under the same conditions as the regular five sec. group except 
that the goal box following choices of the white alley was white with a white food cup, and the box 
following choices of the black was black with no reward. The delay compartments were gray 
as in the regular five sec. group. The black and white goal boxes were placed inside the regular 
gray ones and were shifted from left to right to correspond with the stimulus alleys. 

Another group of 10 rats was run under the five sec. delay conditions, with the blocks in one 
color alley and the incline in the other. Half of the animals had the incline in the white alley 
with the blocks in the black, and half had the reverse arrangement. White was correct in both 
cases, and both the delay compartments and the goal boxes were gray as in the original five sec. 
delay group. 


RESULTS 
Gradient Experiment: 
TABLE I

NUMBER OF TRIALS REQUIRED BY EACH ANIMAL TO REACH THE CRITERION, AND THE
MEDIAN NUMBER FOR EACH EXPERIMENTAL GROUP

Group          0     0.5     1.2       2       5      10   5 sec., B&W  5 sec., diff.
              20     100     220     300     320     84?             ?            320
              20      0?     160     150     350     850           180            380
              10      60     160     190     250   1440+           160            600
              20     140     280     440     560   1440+           130            270
              20      90     230     360    700+   1440+           130            210
              10      40     140     370     350                   140            390
              10     100     250     260     600                   150            350
              10      90     230     440     650                   200            170
              30     130     180     240   1440+                   270            140
              20      80     170     280    1260                   180            270
Median        20      95     200     290     580 (1440+)           155            295

Columns 0 to 10 give the delay of reward in sec. "5 sec., B&W" is the five sec. group with
black and white goal boxes; "5 sec., diff." is the five sec. group with differential responses.
? marks an entry that is illegible or doubtful in the source.

The number of trials required by each S to reach the criterion of
learning is shown in Table I. The median number of trials for the
zero delay group was 20; for 0.5 sec., 95 trials; for 1.2 sec., 200 trials;
for 2 sec., 290 trials, and for 5 sec., 580 trials. In the five sec. group
one animal was discontinued after failure to learn in 700 trials and 
another after 1440 trials. No median is available for the 10 sec. 
delay group since three of the five Ss failed to learn in 1440 trials. 
It was not possible to apply the t-test of statistical significance to the 
differences between the groups because the assumptions of normality 
and homogeneous variance were not fulfilled, and because there were 
some indeterminate values in cases where animals failed to learn. 
However, a test proposed by Festinger (1) which makes no assumptions
as to the form of the distributions was applied.⁴ All differences
between adjacent groups were significant at the five percent level of
confidence or higher. The results for the comparisons
made with this test are shown in Table II.


TABLE II

RESULTS OF THE FESTINGER TEST OF THE SIGNIFICANCE OF THE DIFFERENCES BETWEEN THE
GROUPS IN THE NUMBER OF TRIALS REQUIRED TO REACH THE CRITERION OF LEARNING

The P-values are the levels of confidence at which the hypothesis may be rejected that
the two groups compared were drawn from the same population.

Groups Compared                        d       P
0 vs. .5 sec.                        5.00     .01
0 vs. 1.2 sec.                       5.00     .01
.5 vs. 1.2 sec.                      4.05     .01
.5 vs. 2.0 sec.                      5.00     .01
1.2 vs. 2.0 sec.                     3.25     .05
1.2 vs. 5.0 sec.                     4.20     .01
2.0 vs. 5.0 sec.                     3.20     .05
5 sec. vs. B & W Boxes               4.90     .01
5 sec. vs. different responses       3.00     .05

Learning curves for the six groups for 700 trials are shown in Fig.
2. These curves represent the percent of correct choices for successive
blocks of 20 trials. Each S is assumed to continue at the 100
percent level after reaching the criterion. Tests with several animals
showed this to be the case with only very slight deviation.

Fig. 2. Learning curves for each of the six different delay groups

⁴ The test of statistical significance proposed by Festinger (1) is based on a pooled ranking
of the measures in the two groups to be compared. It is then possible to make statements of
probability about the sum of ranks of one group. The test yields a statistic, ‘d,’ which may be
referred to tables giving the values of ‘d’ for various N’s required for significance at the one
and five percent levels of confidence. Strictly speaking, the hypothesis tested is that the two
groups of measures are samples drawn from the same population, but the test is believed by
Festinger to be most sensitive to differences between means. The only assumptions involved
are that the two samples are independent and drawn at random.

Gradient curves to show learning as a function of delay are shown
in Figs. 3 and 4. In Fig. 3 the reciprocal of the number of trials required
by each group to reach the level of 75 percent correct choices
is plotted against the time of the delay of reward.⁵ Fig. 4 shows a
gradient based on the percent of correct choices for each group during
trials 141-180. The sigma value, based on the percent correct, is
assumed to be a measure of the difference in habit strength between
the correct and incorrect responses.⁶ The fact that the 10 sec. group
is below zero reflects the fact that this group had, at this stage of learning,
not yet overcome a slight initial preference for the black alley.

Fig. 3. Rate of learning as a function of delay of reward. The reciprocal × 1000 of the
number of trials to reach the level of 75 percent correct choices is plotted against the time of
delay. Experimental values are represented by black dots and the smooth curve is fitted to
these data.

Fig. 4. Sigma values based on the percent of correct responses during
trials 141-180 for each delay group

⁵ Mathematically this function may be represented by a hyperbola of the reciprocal type.
The equation for the curve fitted to the data of Fig. 3 is:

$$R = \frac{1}{.023 + .147\,T}$$

where R is the reciprocal × 1000 of the number of trials to reach 75 percent correct, and T is the
time in sec. of the delay of the reward.


Secondary Reinforcement Experiments: 


The two groups of concern in this portion of the experiment are
the five sec. delay group with the black and white goal boxes, and the
five sec. delay group in which the animals were forced to make different
responses to the two stimulus alleys. These groups may be
compared to the five sec. group in the gradient experiment, where no
differential secondary reinforcement was present. The numbers of
trials required by each animal to reach the criterion are shown in
Table I. The medians were 155 for the black-white goal box group
and 295 for the differential response group as contrasted with 580
trials for the five sec. control group. Again the t-test of statistical
significance could not be applied. According to the test used above,
the difference between the black-white goal box group and the five
sec. control group is significant at the one percent level of confidence,
and the difference between the differential response group and the
control group is significant at the five percent level.

⁶ This measure, suggested by Hull (5), is based on the assumption that the excitatory
strengths of the two competing responses oscillate from moment to moment according to the
normal probability function. Consequently, their difference would also oscillate in this manner.
The result is that, when the tendencies are equal, there is a 50 percent choice of each response,
and as the difference between them increases, the percent of choice of the stronger increases until
the ranges of oscillation no longer overlap, and one response is chosen 100 percent of the time.
Any percent of occurrence of one response may be converted into an amount of difference value
by means of the normal integral table. This gives a standard score representing the difference
between the excitatory (habit) strengths of the competing responses.
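
The sigma values plotted in Fig. 4 come from converting a percent of correct choices into a standard score by way of the normal integral table. A minimal sketch of that conversion follows, using the inverse normal distribution function in place of a printed table. It assumes the unit normal curve with no further scaling, which is how the description reads; the function name is hypothetical.

```python
from statistics import NormalDist


def sigma_value(percent_correct):
    """Convert a percent of correct choices into a standard (sigma) score.

    50 percent maps to zero (equal habit strengths); values below 50 percent
    give negative scores, i.e. a preference for the incorrect stimulus.
    """
    if not 0 < percent_correct < 100:
        raise ValueError("sigma is undefined at 0 or 100 percent")
    return NormalDist().inv_cdf(percent_correct / 100)


if __name__ == "__main__":
    for pct in (45, 50, 75, 90):
        print(pct, round(sigma_value(pct), 3))
```

A group still choosing the correct alley less than half the time, as the 10 sec. group did during trials 141-180, therefore receives a negative sigma value.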

The learning curves for the three groups are shown in Fig. 5.
They are plotted as percent correct for blocks of 20 trials. The rate
of learning for the black-white goal box group is clearly much more
rapid than that of the other two. The difference between the differential
response group and the control group was tested by applying the
t-test for related measures to the differences between blocks of 20
trials for the 300 trials from 141 to 440. The mean number correct
in each block of 20 trials was obtained for each group. Since the
differences between these means approximated a normal distribution
the t-test was appropriate. A t of 3.61 was obtained, which for 14
degrees of freedom means that the hypothesis that the mean difference
is zero may be rejected at the one percent level of confidence. This
means that the two curves differ significantly, and that the differential
response group did learn at a significantly faster rate than the rats
in the control condition.

Fig. 5. Learning curves for the three different groups with five sec. delay
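
The t-test for related measures used here pairs the two groups block by block, and the 14 degrees of freedom follow from the 15 blocks of 20 trials between trials 141 and 440. The sketch below shows the computation in its standard form. The block means in the example are invented for illustration, since the source does not list them, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def n_blocks(first_trial, last_trial, block_size=20):
    """Number of complete blocks between two trial numbers, inclusive."""
    return (last_trial - first_trial + 1) // block_size


def related_t(group_a, group_b):
    """t for related measures and its degrees of freedom.

    Each list holds one mean per block; blocks are paired by position.
    """
    diffs = [a - b for a, b in zip(group_a, group_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    print(n_blocks(141, 440))  # 15 blocks, hence 14 degrees of freedom
    # Hypothetical mean number correct per block for three blocks.
    print(related_t([12, 14, 15], [10, 13, 12]))
```

Pairing by block removes the shared upward trend of the two learning curves from the error term, so the test bears on the consistent separation between the curves.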


DISCUSSION OF RESULTS


One fact of primary interest in the above data is the steepness 
and short duration of the obtained delay of reinforcement gradient. 
It should be pointed out that the discrimination problem itself was
a very easy one, as shown by the unusually low median number of
20 trials required to learn the problem with no delay. It is significant
that a delay of even one-half sec. required almost five times that number
of trials, and that at five sec., the median trials to learn increased
to 580 and two animals failed to learn in 700 and 1440 trials respectively.
Ten sec. delay was apparently beyond the limit under which
learning was possible for three of the five Ss, and the other two were
able to learn only after about 850 trials. The discrepancy between
the results of this study and those of Wolfe (11) and Perkins (8) is
shown strikingly in Fig. 6, in which sigma values similar to those of
Fig. 4 are plotted against length of delay. Perkins’ experiment differed
from Wolfe’s in that differential secondary reinforcement in the
delay boxes was eliminated by alternating the two delay boxes in a
random order and by rotating the maze in the room. However, there
was the possibility of secondary reinforcement based on the reinforcement
of the proprioceptive trace of the consistently correct turning
response. This possibility was eliminated in the present
experiment, since there was no motor response pattern which was
always correct. The difference between Perkins’ results and those of
this experiment may be accounted for by the presence of such secondary
reinforcement in the one but not in the other.

Fig. 6. Sigma values from blocks of trials for the various delay groups in the Wolfe, Perkins
and present experiments. The blocks of trials included for each of the three experiments are
as follows: Wolfe, trials 7, 8, 9, and 10; Perkins, first 36 trials; present experiment, trials
141-180.

The gradient obtained here is also steeper and shorter than that 
obtained by Perin (7). The results of the two experiments are com- 
pared in Fig. 7. Both of the gradient curves are based on the slopes 
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Fic. 7. Rate of learning as a function of time of delay in the Perin and present experiments. 
The slopes of the group learning curves are plotted against the time of delay of reward. The 
Perin experimental data are represented by black squares and the present data by black circles. 
The smooth curves are fitted to the empirical data. 


of group learning curves which were plotted as percent of correct 
trials. Perin’s learning curves ranged from zero to 100 percent correct
responses. The slopes plotted for his experiment are the slopes
of the tangents to the fitted learning curves at the 50 percent point. 
The learning curves of the present experiment range from 50 to 100 
percent correct. Since the middle portions of these curves were 
approximately linear, the slopes plotted in the gradient curve are the 
slopes of straight lines fitted to the portions of the learning curves 
between 60 and 90 percent. These two gradients then represent the
slopes of the learning curves (the rate of learning) of both experiments
at the point where learning was half completed.⁷ Perin’s
results indicate that learning may be obtained in his situation up to 
about 30 or 35 sec. delay, and the present data suggest that the limit 
is about 10 sec. or possibly less for most Ss. Again the discrepancy 
may be accounted for by the possibility that Perin’s rats obtained 
immediate secondary reinforcement from the proprioceptive cues 
following the act. A fact of interest is that in Perin’s second experi- 
ment (7), involving a differentiation between two bar-pressing habits, 
no learning was obtained with delays of only two and five sec. if the 
bar was removed immediately following both correct and incorrect 
responses. It was only when the bar remained in place following the 
incorrect response, but was removed immediately following the cor- 
rect response, that the 30-sec. gradient was obtained. The result of 
this difference in procedure was probably to make the difference be- 
tween the correct and incorrect responses greater. The removal of 
the bar following response would certainly affect the postural adjust- 
ments to the bar and those immediately following the act of pressing 
the bar. Thus the pattern of response following removal of the bar 
would always be reinforced, while that following the incorrect re- 
sponse and the bar remaining in place would never be reinforced. 
Apparently the difference between the two motor patterns was not 
enough to provide differential secondary reinforcement unless this 
additional difference were introduced. Another factor is that the 
occurrence of massed, non-reinforced incorrect responses with the 
bar in place would tend to eliminate that response through the process 
of extinction. 

If there is no true primary gradient of reinforcement, as previously 
suggested, the question may be raised as to why even the short 
gradient effect was obtained in the present experiment. One plau- 
sible answer is that the basis of the secondary reinforcement is the 
perseverative stimulus trace or sensory after-effect from the black or 
white stimulus. So long as any traces of the black and white stimuli 
remain discriminably different at the time of reward, there is the 
possibility of differential secondary reinforcement. The stimulus 
trace from the positive white stimulus in the choice chamber could
thus acquire secondary reinforcing potency through generalization 


7 The curve fitted to the data from the present experiment in Fig. 7 is of the same type as
the one in Fig. 3. The equation is:

$$S = \frac{1}{.3704 + 2.83T},$$

where S is the slope of the learning curve and T is the time of the delay of reward. The equation 
fitted by Perin to his data and reproduced here is: 


S = 1.6-107-7 — .043T + 1.45. 
from its perseverative trace which is contiguous with the food reward.
The data suggest that the traces or the differences between them de- 
crease rapidly during the first sec., are at a very low level at the end 
of five sec., and have more or less disappeared at the end of 10 sec. 
or are sufficiently different from those at the choice point to be beyond 
the range of generalization. 

The black-white goal box experiment demonstrates clearly the 
effect of secondary reinforcement in delayed reward learning. The 
stimulus, being present at the time of reward, acquired secondary 
reinforcing properties. Thus, upon orienting toward and entering 
the white alley, the S received immediate secondary reinforcement. 
The result was a marked speeding up of learning, even though the 
food reward itself was delayed for five sec. 

The fact that animals which were forced to make consistently 
different motor responses to the two stimuli learned at a significantly 
faster rate than the Ss for which this was not the case, may be inter- 
preted in terms of proprioceptive secondary reinforcement as de- 
scribed in the Perin and Perkins experiments. In this group there 
were not only different visual traces following the different choices 
but also entirely different afferent traces resulting from the different 
postural and motor adjustments required in the two alleys. It is 
reasonable to assume that in the rat such differential proprioceptive 
stimulation effects would continue longer than the visual after-effects. 
As stated above, the limits of delay with which learning may occur 
would depend on the length of time during which these two propri- 
oceptive traces remain discriminably different. During the period 
in which this difference between the traces remains, the food reward 
following correct responses will produce and strengthen secondary 
reinforcing properties for the trace of the stimuli produced by the 
correct response. No such reward is associated with the traces of 
the incorrect response. As long as the trace of the correct response is 
within the range of the generalization gradient of the proprioceptive 
pattern stimulating the organism at the time of the response in the 
white alley, this proprioceptive pattern will acquire secondary rein- 
forcing properties. It is this immediate secondary reinforcement 
that accounts for the superiority of the differential motor response
group over the five sec. delay group with no such differential motor
response. 


SUMMARY 


1. Groups of white rats were run on a black-white discrimination 
problem with delays of reward of 0, 0.5, 1.2, 2, 5, and 10 sec.

2. A very steep delay of reinforcement function was obtained 
within this range, with no learning by three of five Ss in the 10 sec.
group. 

3. When immediate secondary reinforcement was introduced by
allowing the animal to eat in a goal box of the same color as the 
positive stimulus, learning with delayed reward was greatly facili- 
tated. 

4. When animals were forced to make characteristically different 
motor responses to the black and white stimuli, they learned at a 
significantly faster rate than animals which received equal delay, but 
made no such characteristically different motor adjustments. 

5. The data are consistent with a theory which assumes no “primary”
delay of reinforcement gradient, but accounts for learning
under delayed reward conditions in terms of some type of immediate 
secondary reinforcement. Such secondary reinforcement may be 
based upon proprioceptive stimulation resulting from the response 
and continuing until the moment of the reward. The proprioceptive 
pattern accompanying the correct response acts as a secondary 
reinforcing agent by virtue of its similarity to the traces which on 
previous trials have lasted until the reward. In the usual visual 
discrimination learning situation no differential proprioceptive stimuli
follow correct and incorrect choices. In such situations, learning is 
possible with only very short delays of reward. What learning 
does occur may be attributed to immediate secondary reinforcement 
from the visual stimuli. This secondary reinforcement is presumably 
based on traces of the visual stimuli which continue until the time of 
the primary reward. 


(Manuscript received March 14, 1947) 


REACTIVELY HETEROGENEOUS COMPOUND TRIAL-
AND-ERROR LEARNING WITH DISTRIBUTED
TRIALS AND SERIAL REINFORCEMENT¹


BY CLARK L. HULL 


Institute of Human Relations, Yale University 


INTRODUCTION 


This study differs sharply from the three earlier ones of the present 
series (1, 5, 7) in that it involves serial reinforcement. From a theoretical
point of view it is particularly to be contrasted with heterogeneous
compound trial-and-error learning by terminal reinforcement.
It will be recalled (5) that in this latter type of learning the maximum 
rate of acquisition of habit strength, according to the uncomplicated 
gradient-of-reinforcement principle, is to be expected to occur at 
the final choice point. The question naturally arises as to what role 
the gradient of reinforcement will play where reinforcement occurs 
serially, i.e., after each one of a series of correctly performed acts. 

There is much evidence (4, p. 135 ff.) that the immediate giving 
of food to a hungry animal at the conclusion of a series of acts sets 
up a gradient of habit strength whose maximum is at the act directly 
preceding the feeding, and that the remaining acts of the series re- 
ceive a strength which decreases according to a negative growth 
function of their respective temporal remoteness from the point of 
reinforcement. In case a four-act sequence is reinforced after each 
act, it is assumed that each reinforcement will generate a separate 
gradient of habit strength, all such gradients being identical except 
as to the point of origin. Naturally such gradients will overlap and 
their habit strengths will summate. It was mainly to investigate 
empirically some of the implications of the above assumptions that 
the present investigation was undertaken. 


APPARATUS, SUBJECTS, AND EXPERIMENTAL PROCEDURE 


The apparatus employed in this investigation was the same as that reported in the last 
published study of the present series (5), the floor plan of which is shown in Fig. 1. The Ss 
utilized were 48 male albino rats approximately 90 days of age at the beginning of the experiment;







1 This is the fourth of a series of coordinated studies from the Institute of Human Relations
on the subject of compound trial-and-error learning. Throughout the present experiment Ruth
Hays acted as laboratory technician. She also scored the original graphic records and tabulated
the primary data. The writer is indebted to Charles B. Woodbury for the calculation of the
statistical reliabilities.
they were purchased from the Albino Farms breeding laboratory. Their diet was made up of
Gaines dog biscuit supplemented once a week by lettuce and codliver oil. The training trials 
were given under a 23-hour food privation. The animal was put into the maze at S (Fig. 1). 
During the prelimi 
open, at each choice point during the main or training trials, three of the valves were locked and 
one was free to lift at the rat’s pressure. When the animal reached point F he found a sphere of 
moist food mash lying on a small rectangle of memo paper. These spheres, made with a special 




















apparatus, were one-fourth in. in diameter.² The dotted line in Fig. 1 shows the correct course

Fig. 1. Diagrammatic representation of the floor plan of the maze used in the present study.
The starting point is represented by S, and the points at which food was placed, by the several F’s. 
P represents one of five partitions with a 2.5-in. passageway in its center which forced the animal 
to make its choice of the doors from the same relative position of the runway, regardless of the 
previous choice. The Roman numerals give the ordinal number of the several banks of valves 
or doors from which the animal must choose. The dotted line represents the correct pathway 
of one of the 24 choice combinations. Only the door in each combination marked by the dotted 
line could be pushed open by the animal in question. The valves or doors are numbered differ- 
ently for each animal in such a way that the valve which is correct at a given choice point has 
the same number as the choice point in question, as shown by the numbers placed in the door 
positions at choice point IV. 


taken by a rat in one of the 24 different door combinations possible to set up in this apparatus 
so that a rat would never pass through a door in a given position twice in progressing through the 
maze. Thus, two of our 48 animals were run on each of the 24 possible door combinations. The
training of one set of animals by the 24 door combinations was completed before the second set 
of 24 animals was begun. All animals alike were given 50 trials, one trial per day except on 
week-ends when 47 hours intervened between successive trials. The sequence of each set of the 
24 animals in each group was varied systematically from day to day in order to avoid any learning 
by odor cues left by the tracks of the preceding animal. 


RESULTS 


The critical results of the investigation as regards the compounded 
gradient of reinforcement may be seen in Table I, which shows the 
mean first-choice errors made by the 48 animals at the respective 
choice points together with the standard deviations of the respective 
distributions and the standard errors of the corresponding means. 
These four means reduced to percentages based on their total are 
represented graphically in Fig. 2. Fig. 2 also shows a graph of the 
same type of learning (heterogeneous) with the same apparatus but 
with terminal reinforcement (5). An inspection of this figure shows 


2 The animals always ate all four pellets provided in the apparatus. At the beginning of
learning the rats ate each pellet as soon as they reached it. In the later stages of training, how- 
ever, some of the animals frequently would seize the first pellet encountered and, holding it in the 
mouth, would proceed at once to the next pellet. Usually at this point the animal would pause 
to eat both pellets, though sometimes an animal would carry two pellets on to the third pellet 
before eating. For some reason unknown to us, the animals seem never to have carried any 
pellets from the third pellet position to the fourth, or final, position. 













TABLE I

THE MEANS OF THE TOTAL FIRST-CHOICE ERRORS MADE AT THE SEVERAL CHOICE POINTS
TOGETHER WITH THE SD’S OF THE DISTRIBUTIONS AND THE
STANDARD ERRORS OF THE MEANS

Choice Point    Mean    σ dis.    σ M
I               20.6    16.5      2.4
II              27.3    14.4      2.1
III             29.2    14.7      2.1
IV              24.1    14.4      2.1




that whereas the terminal reinforcement produces fewer errors at 
the final choice point than at the first, the serial reinforcement produces
fewer errors at the first choice point than at the final one. The
two studies agree in showing a greater number of errors at the second 
and third choice points than at either extreme, exactly as is to be ex- 
pected in reactively heterogeneous compound trial-and-error learning 

Fig. 2. Graphic representation of the percent of pooled errors of all three types at the
several choice points of heterogeneous compound trial-and-error learning by serial reinforcement 
(the present study) and, for purposes of comparison, the same by terminal reinforcement by a 
former study (5). Note that by terminal reinforcement the fewest errors occur at the final choice 
point, whereas by serial reinforcement the fewest errors occur at the first choice point. 

TABLE II

THE STATISTICAL RELIABILITIES OF THE DIFFERENCE BETWEEN THE
SEVERAL MEANS SHOWN IN TABLE I

Choice Points    r       Diff.    σ diff.    Diff./σ diff.    Percent Reliability
I, II            .47     6.7      2.31       2.90             99.81
I, III           .31     8.6      2.65       3.24             99.96
I, IV            .08     3.5      3.04       1.16             87.70
II, III          .2      1.9      2.54       .75              77.34
II, IV           .06     −3.2     2.85       1.11             86.65
III, IV          −.11    −5.1     3.12       1.62             94.74








(5). The statistical reliabilities of the six differences existing be- 
tween the four means are given in Table II; that between choice 
points I and IV is only 87.7. Nevertheless the greater ease of learning 
at the anterior end of reactively heterogeneous sequences by serial 
reinforcement can hardly be doubted, because it has been found in 
two additional quite independent studies, as yet unpublished. 
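
The entries of Table II appear to follow the usual computation for the reliability of a difference between correlated means, although the text does not state the formula. The following sketch rests on three assumptions that are not in the original: that the first numerical column of Table II is the correlation r between the error scores at the two choice points, that each standard error is the SD of Table I divided by the square root of 48, and that the final column is the percent of the normal curve lying below the critical ratio. The function names are illustrative only.

```python
import math

N_ANIMALS = 48  # number of animals contributing to each mean


def se_of_difference(sd_a, sd_b, r, n=N_ANIMALS):
    """Standard error of the difference between two correlated means (assumed formula)."""
    se_a = sd_a / math.sqrt(n)
    se_b = sd_b / math.sqrt(n)
    return math.sqrt(se_a ** 2 + se_b ** 2 - 2 * r * se_a * se_b)


def percent_below(critical_ratio):
    """Percent of the unit normal curve lying below the given critical ratio."""
    return 100 * 0.5 * (1 + math.erf(critical_ratio / math.sqrt(2)))


# Choice points I and II: SDs of 16.5 and 14.4 (Table I), r taken as .47 (Table II).
print(round(se_of_difference(16.5, 14.4, 0.47), 2))
# Critical ratios of 2.90 (I vs. II) and 1.16 (I vs. IV).
print(round(percent_below(2.90), 2))
print(round(percent_below(1.16), 2))
```

Under these assumptions the sketch gives 2.31 for the standard error of the difference between choice points I and II, and 99.81 and 87.70 percent for critical ratios of 2.90 and 1.16, in agreement with the tabled values.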

A somewhat more detailed view of the outcome of the present 
experiment, particularly in regard to the origin of the distortion of 
the joint gradient of reinforcement as caused by the heterogeneous 
reactions which when generalized become errors, is given in Table III. 


TABLE III

ANALYTICAL TABLE SHOWING THE NUMBER OF FIRST CHOICES OF THE VARIOUS KINDS
MADE BY THE 48 ANIMALS OF THE PRESENT STUDY

The numbers representing the correct choices are marked with an asterisk. The valves or
doors (see Fig. 1) are numbered according to the choice points at which they are correct, i.e., free
to allow passage.

Choice    No. of            Sets of 10 Trials              Total of     Percent
Point     Valve      1      2      3      4      5         50 Trials    Each Valve

I         1          183*   273*   289*   329*   338*      1412*        58.8*
          2          95     74     73     63     62        367          15.3
          3          106    66     50     34     26        282          11.8
          4          96     67     68     54     54        339          14.1

II        1          105    37     22     26     18        208          8.7
          2          133*   192*   239*   242*   286*      1092*        45.5*
          3          105    122    110    112    92        541          22.5
          4          137    129    109    100    84        559          23.3

III       1          84     58     32     32     20        226          9.4
          2          107    80     77     65     83        412          17.2
          3          156*   178*   197*   232*   236*      999*         41.6*
          4          133    164    174    151    141       763          31.8

IV        1          86     75     26     25     20        232          9.7
          2          100    97     88     79     70        434          18.1
          3          133    69     95     99     96        492          20.5
          4          161*   239*   271*   277*   294*      1242*        51.7*
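
The final column of Table III and the means of Table I can both be recovered from the 50-trial totals, since each choice point received 48 × 50 = 2400 first choices. The following is a minimal sketch of that arithmetic. The variable and function names are illustrative, and entries that are hard to read in the printed table are assumed to take the values required by the row and column totals (for example, 559 for valve 4 at choice point II).

```python
ANIMALS = 48
TRIALS = 50
FIRST_CHOICES = ANIMALS * TRIALS  # 2400 first choices at each choice point

# 50-trial totals for valves 1-4 at each choice point, as read from Table III.
# Assumption: illegible entries are filled in from the row and column totals.
TOTALS = {
    "I": [1412, 367, 282, 339],
    "II": [208, 1092, 541, 559],
    "III": [226, 412, 999, 763],
    "IV": [232, 434, 492, 1242],
}
# Position (0-based) of the correct valve at each choice point.
CORRECT_VALVE = {"I": 0, "II": 1, "III": 2, "IV": 3}


def percent_each_valve(point):
    """Percent of all first choices at a choice point that went to each valve."""
    return [100 * count / FIRST_CHOICES for count in TOTALS[point]]


def mean_first_choice_errors(point):
    """Mean number of first-choice errors per animal over the 50 trials."""
    counts = TOTALS[point]
    errors = sum(counts) - counts[CORRECT_VALVE[point]]
    return errors / ANIMALS


for point in TOTALS:
    percents = ", ".join(f"{p:.1f}" for p in percent_each_valve(point))
    mean_err = mean_first_choice_errors(point)
    print(f"{point}: percent = {percents}; mean errors = {mean_err:.2f}")
```

The mean errors obtained in this way (about 20.6, 27.3, 29.2, and 24.1) agree with Table I to within rounding, which suggests that the two tables summarize the same first-choice records.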

Attention is particularly called to the final column. For easy comprehension
these latter values are represented graphically in Fig. 3.
Here the upper curve (solid line) represents successes instead of the 
errors of Fig. 2. Here once more we observe the greater ease of learn- 
ing at the first choice point. Glancing at the six generalization 

Fig. 3. Graphic representation of the percent choices at the several choice points, both
correct and erroneous, on the entire 50 trials of a compound trial-and-error learning by serial
reinforcement


gradients below (broken lines) we notice, just as in the case of ter- 
minal reinforcement (5), that the initial slope of each generalization 
is markedly steeper in the case of the perseverative generalizations 
than in the case of the antedating or anticipatory generalizations. 
It is also noteworthy that with both types of generalization the initial 
section of five of the six slopes is steeper as one moves from right to 
left in the figure. As a final observation of the generalization 
gradients it may be pointed out that the antedating gradients fall 
progressively as the distance from the point of origin increases,
whereas the two longest perseverative gradients actually rise a little 
at the greater distances from the point of origin. 

As in the case of the terminal reinforcement (5), the slopes of the 
initial segments of the perseverative generalization gradients of the 
present experiment were roughly determined by calculating the dif- 
ference in percent of choices of the first door at choice point I (at 
which reinforcement occurred) and at choice point II (at which non- 

Fig. 4. Relative steepness of the initial segments of the major perseverative and antedating
generalization gradients as a function of the successive sets of 10 learning trials 


reinforcement of the reaction reinforced at choice point I occurred), 
at each of the five stages of training. Parallel determinations were 
made for the fourth-door choices at choice points IV and III. These 
determinations were made primarily to secure an indication of the 
influence of the differential reinforcement involved in the learning 
upon the slope of the two generalization gradients. Both sets of 
results are represented graphically in Fig. 4. There it may be seen 
that: 


1. The initial segments of both the perseverative and the ante- 
dating generalization gradients increase progressively in steepness 
as the number of differential reinforcements increases. 

2. The differentiation occurs much more rapidly in the case of
perseverative generalization than in that of antedating generalization. 

3. The difference between the two types of generalization in 
the present experiment involving serial reinforcement is noticeably 
greater than in the parallel experiment (5) involving terminal rein- 
forcement. 
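
The slope determinations described above can be sketched directly from the first-choice counts of Table III. The sketch below assumes that each set of 10 trials comprises 480 first choices per choice point (48 animals × 10 trials) and uses the counts as read from that table, with illegible entries filled in from the row and column totals. The names are illustrative and the figures are a reconstruction, not the values plotted in Fig. 4.

```python
CHOICES_PER_SET = 48 * 10  # 48 animals x 10 trials in each set

# First-choice counts per set of 10 trials, as read from Table III (assumed values).
door1_at_I = [183, 273, 289, 329, 338]    # door 1 where it is reinforced
door1_at_II = [105, 37, 22, 26, 18]       # door 1 one choice point later
door4_at_IV = [161, 239, 271, 277, 294]   # door 4 where it is reinforced
door4_at_III = [133, 164, 174, 151, 141]  # door 4 one choice point earlier


def initial_slope(reinforced, generalized):
    """Difference in percent of choices between the reinforced and adjacent point."""
    return [100 * (a - b) / CHOICES_PER_SET for a, b in zip(reinforced, generalized)]


perseverative = initial_slope(door1_at_I, door1_at_II)
antedating = initial_slope(door4_at_IV, door4_at_III)

for stage, (p, a) in enumerate(zip(perseverative, antedating), start=1):
    print(f"set {stage}: perseverative = {p:.1f}, antedating = {a:.1f}")
```

On these counts both differences grow from set to set, and the perseverative difference exceeds the antedating one at every stage, which is the pattern summarized in the first two points above.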


DISCUSSION 


At this point in the exposition it may be well to elaborate on the 
theoretical implications of the fact that serial-reinforcement learning 
eventuates in results which are quantitatively opposite those observed 
in terminal-reinforcement learning. In this connection it may be 
observed that the uncomplicated gradient of reinforcement is as- 
sumed, on the basis of a certain amount of evidence, mostly indirect 
(4, p. 160), to be 

$$m' = M' \times 10^{-jt}$$

in which: 


M' is the maximum reaction potential, here taken as 3.1 units,³
producible under the learning conditions involved, by the number 
of reinforcements employed; 

m’ is the reaction potential produced by the above conditions 
with a given delay in reinforcement; 

j is a constant here arbitrarily taken for purposes of illustration 
as .03; and 

t is the delay in reinforcement which in the present experiment 
is assumed to be, on the average, 10 sec. between adjacent choice 
points, i.e.,


I to IV = 30 sec.
II to IV = 20 sec.
III to IV = 10 sec.
IV = 0 sec.


Substituting in this equation for the value of the gradient at 30 
sec. delay in reinforcement, we have 


$$m' = 3.1 \times 10^{-.90} = 3.1 \times \frac{1}{7.944};$$

$$m' = .39.$$

Substituting similarly in the same equation the three remaining delays,
we have as the several values of the simple gradient of reinforcement
of the maximum length here involved,

delay in reinforcement    30 sec.    20 sec.    10 sec.    0 sec.
value of m'               .39        .78        1.55       3.10

3 The unit is the standard deviation (σ) of the variability in reaction potential of the act
involved (see 7, p. 205 ff.).

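
The four values of the simple gradient can be checked with a few lines of arithmetic. The sketch below takes M' = 3.1, j = .03, and delays of 30, 20, 10, and 0 sec. from the text; the function name is illustrative only.

```python
M_PRIME = 3.1  # maximum reaction potential, in sigma units (from the text)
J = 0.03       # gradient constant, taken arbitrarily in the text for illustration


def reaction_potential(delay_sec, m_max=M_PRIME, j=J):
    """Reaction potential m' = M' * 10**(-j*t) for a delay of t seconds."""
    return m_max * 10 ** (-j * delay_sec)


for delay in (30, 20, 10, 0):
    print(f"{delay:2d} sec.: m' = {reaction_potential(delay):.2f}")
```

With these constants the sketch returns .39, .78, 1.55, and 3.10, so that each additional 10 sec. of delay halves the reaction potential to within rounding.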
But with gradients of reinforcement originating at all four choice 
points of the present experiment, we naturally have different numbers 
of gradient-of-reinforcement values being summated at each point. 
These, suitably arranged, are shown in Table IV. 


TABLE IV

THE THEORETICAL GRADIENT-OF-REINFORCEMENT VALUES WHICH ORIGINATE IN THE SEVERAL
REINFORCEMENTS AND WHICH FUNCTION AT THE RESPECTIVE CHOICE POINTS OF A FOUR-
CHOICE COMPOUND TRIAL-AND-ERROR LEARNING SITUATION, TOGETHER WITH THE
HYPOTHETICAL SUMMATION (+) OF THE VALUES AT EACH CHOICE POINT

                                 First Choice    Second Choice    Third Choice    Fourth Choice
                                 Point           Point            Point           Point
First reinforcement              3.10
Second reinforcement             1.55            3.10
Third reinforcement              .78             1.55             3.10
Fourth reinforcement             .39             .78              1.55            3.10
Physiological summation (+)      3.45            3.42             3.35            3.10

















The method of calculating the physiological summation (+) of 
the values in each of these columns is as follows. Taking the pair of
values from the third choice point, we have

$$3.10 \;(+)\; 1.55 = 3.10 + 1.55 - \frac{3.10 \times 1.55}{M},$$





where M is the maximum reaction potential possible to obtain in the 
present learning situation with an unlimited number of reinforcements.
In the present situation M is taken as 3.7. Substituting the
value of M and solving, we have 


3.10 (+) 1.55 = 3.35.


By applying this method successively to a pair of values, then to the 
result so secured in conjunction with a new value, and so on, the 
physiological summation (+) of Table IV was calculated. These 
values represent the hypothetical gradient of serial reinforcement. 
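
The successive summation just described can be sketched as a pairwise rule applied from left to right. The sketch assumes the rule a (+) b = a + b − ab/M with M = 3.7, as given in the text, and takes the gradient values acting at each choice point from Table IV; the names are illustrative.

```python
from functools import reduce

M = 3.7  # maximum reaction potential attainable in the situation (from the text)


def physiological_sum(a, b, m=M):
    """Combine two reaction potentials: a (+) b = a + b - a*b/m."""
    return a + b - a * b / m


# Gradient values functioning at each choice point (the columns of Table IV).
COLUMNS = {
    "I": [3.10, 1.55, 0.78, 0.39],
    "II": [3.10, 1.55, 0.78],
    "III": [3.10, 1.55],
    "IV": [3.10],
}

# Apply the rule to the first pair, then to that result and the next value, and so on.
summed = {point: reduce(physiological_sum, values) for point, values in COLUMNS.items()}

for point, value in summed.items():
    print(f"choice point {point}: {value:.2f}")
```

The sketch reproduces the bottom row of Table IV (3.45, 3.42, 3.35, 3.10). Because the rule is equivalent to multiplying the remaining fractions 1 − a/M, the order in which the values are combined does not affect the result.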

An examination of this set of serial-reinforcement values shows 
that: 


1. The minimum value falls at the final act of the series. 

2. The maximum value falls at the first act of the series. 

3. The difference between the adjacent values in this gradient 
is maximal (.25) at the posterior end of the series and minimal (.03) 
at the anterior end of the series. 

It may be added that, according to the final row of values given
in Table IV, the number of successes at the final choice point (IV) 
should, except for the accuracy limitations due to sampling, be the 
same by serial reinforcement as for terminal reinforcement, whereas 
the corresponding mean at choice point I should be appreciably higher 
for serial reinforcement. A glance at Fig. 2 will show that this partic- 
ular theoretical expectation does not occur. 

A second matter worthy of theoretical consideration in the above 
data concerns the generalization gradients. The explanation of the 
general fact of error generalization in heterogeneous compound trial- 
and-error learning, as well as of the characteristic steeper initial slope 
of the perseverative gradients, has been presented in an earlier article 
(5). We have here, however, a phenomenon differing from the re- 
sults of this type of learning by terminal reinforcement which, while 
small in magnitude and probably lacking in statistical reliability, is 
of some systematic interest. This is the fact that whereas all of the
antedating generalization (error) gradients continue to fall through- 
out their course, both of the more extended perseverative gradients 
rise slightly after their initial fall. Thus, the gradient originating at 
choice point II amounts to 17.2 percent at III and rises to 18.1 at IV; 
and the gradient which originates at choice point I amounts to 8.8 at 
II, then rises to 9.4 at III, and to 9.7 at IV.

While seemingly paradoxical from the point of view of stimulus 
generalization, which has been assumed to fall continuously toward 
a sizable asymptote (4, p. 185), it must be remembered that in the 
present reactively heterogeneous situation the error score is not 
determined merely by the strength of the single generalization tend- 
ency being plotted but jointly with the other reaction potentials which 
are in competition at any particular choice point. Now, an in- 
spection of the graphic representation of the perseverative traces, 
both in this (Fig. 3) and the previous reactively heterogeneous study 
(5), suggests that the exponent of the generalization gradient equa- 
tion must be so large that this curve has fallen very nearly to its as- 
ymptote by one choice-point distance removal from the place of rein- 
forcement. This means that a further degree of remoteness will 
cause hardly any further fall. However, the strength of the major 
competing reaction potential, that of the reactions reinforced at point 
III and especially at point IV, is markedly diminished, as shown in
the lowest row of values in Table IV. It would naturally follow, 
other things equal, that the perseverative generalization tendency 
originating at choice point I would fall sharply at II, would rise a 
little at III, and would rise a little more at IV. 

With a basis of the gradient of serial reinforcement available it is 
evident on the logic elaborated in the earlier article already referred
to (5) that the heterogeneous generalized reactions would tend to
accumulate as errors at the two central choice points. Actually, the 
present experiment shows that the maximum of errors falls at the 
third choice point. Now it happens that this error picture (Fig. 2) 
is in a formal sense the same as that of the difficulty of human Ss in 
the rote learning of nonsense syllables. A typical set of the latter 
results as reported by Hovland (2) is represented in Fig. 5. There 

Fig. 5. Mean number of reaction failures of Ss during the learning of 8-unit, 11-unit, and
14-unit lists of nonsense syllables. Note the formal similarity between these curves, particularly 
the 8-syllable one, and the serial reinforcement curve of Fig. 2. (Reproduced from 3, p. 178, 
originally taken from Hovland, 2.) 


it may be seen that: 


1. The minimum learning difficulty in nonsense syllable series 
appears at the initial syllable. 

2. The final syllable presents greater learning difficulty than does 
the first syllable. 

3. The syllable presenting the greatest difficulty of each length 
of series appears at neither end of the series but at a point between 
the middle of the series and the posterior end. 

4. The longer the series, the greater is the difficulty of learning.

5. The shorter the series, the less is the difference in difficulty
between the two ends.

6. The longer the series, the closer, relatively, is the point of 
maximum learning difficulty to the middle. 


On the other hand, a comparison of the two types of learning, 
from the point of view of the processes involved, shows that: 


1. Rote learning of the type here under consideration is reactively 
heterogeneous since all of the syllables are different. On this as- 
sumption each act of the linear maze corresponds to a syllable of the 
rote series. 

2. The reinforcement (presumably secondary in nature, 4, p. 84 
ff.) in rote learning by the method of anticipation occurs when each 
succeeding syllable is presented and so is necessarily serial in nature. 

3. Rote learning manifests both antedating and perseverative 
generalization (6, pp. 75 ff., 84 ff.), presumably for the same reason 
that compound trial-and-error learning does (perseverative stimulus 
traces). 

4. Antedating generalizations are stronger than are the perse- 
verative generalizations in rote learning (6, p. 75 ff.), presumably for 
the same reason as in heterogeneous compound trial-and-error learn- 
ing with animals (5). 

5. Antedating generalizations, at least, in rote learning grow 
steeper as training progresses, much like that shown in the rat maze 
in Fig. 4 (6, pp. 82, 83) and presumably for the same reasons. 


With this striking parallelism in the conditions of the two types 
of learning, the equally striking quantitative parallelism is not surprising.
Taken together the two parallelisms strongly suggest,
despite the unsatisfactory statistical reliability of the difference between
choice points II and III in the present experiment, that essentially 
the same principles are operative in the two cases notwithstanding 
the great difference in the organisms and the experimental arrange- 
ments involved. Thus, once more, lower animal experimentation 
promises to throw light on a complex human process. The parallelism
is of sufficient potential significance as to be worthy of further
investigation. 


SUMMARY 


An experiment has been performed in which 48 male albino rats 
learned a linear maze with four choices at each of four different choice 
points. The correct act at each point was different and each was re- 
warded immediately after being performed. It was found that: 

1. The first act of the series was learned more easily than the last,
which is exactly opposite to the results by terminal reinforcement. 

2. The second and third acts of the series were learned with 
greater difficulty than was either the first or last act, exactly as in
the case of the corresponding learning by terminal reinforcement. 

3. The third act of the maze was learned with maximum difficulty,
exactly as in the case of the corresponding learning by terminal
reinforcement. 

4. Both antedating and perseverative generalizations display 
themselves, the antedating generalization being distinctly the stronger 
much as in terminal reinforcement. 


5. Both antedating and perseverative generalizations were pro- 
gressively weakened by the differential reinforcement involved in the 
learning process, but there was a greater difference between the slopes 
of the two types of generalization in serial reinforcement than in 
terminal reinforcement. 

6. The longer perseverative error gradients tend slightly to rise
at the ends of their second and third segments. 

7. An explanation of the greater ease of learning at the anterior 
end of the maze as compared with the posterior end is offered as 
being due to the summation (+) of portions of separate gradients of 
reinforcement set up by the separate reinforcements in the series of 
four given during each maze performance. 

8. The tendency of the longer perseverative error gradients to
rise near their ends is explained as due to progressively weakened 
competition as the posterior end of the maze is approached. 

9. It is also pointed out that substantially the same principles as
those operating in heterogeneous compound trial-and-error learning 
with rats presumably are operating in human rote learning, which 
would account for the striking resemblance the results with rote 
series have to those yielded by the present experiment. 


(Manuscript received January 6, 1947)
RETROACTIVE AND PROACTIVE INHIBITION AFTER
FIVE AND FORTY-EIGHT HOURS 


BY BENTON J. UNDERWOOD 


Northwestern University 


INTRODUCTION 


Three independent studies (3, 5, 8) have shown that retroactive 
inhibition (RI) is greater than proactive inhibition (PI) when re- 
tention is measured shortly after the learning, e.g., after 20 min. By 
RI is meant the decrement in retention of a task as a consequence of 
another task intervening between the original learning and the re- 
tention test. By PI is meant the decrement in the retention of a 
task as a consequence of the prior learning of another task. Al- 
though RI has been shown to be greater than PI following short time 
intervals, there is reason to believe that this difference is not independent
of the length of the time interval. Two lines of evidence
support this. 


1. Studies concerning the relationship between RI and the time 
between learning and recall suggest that for verbal learning RI re- 
mains relatively constant after varying time intervals. McGeoch’s 
study (2) demonstrates this. In a control series, Ss learned serial 
lists of 10 adjectives to a criterion of three successive perfect trials 
and then recalled after 20 min., 1, 24, 48 and 144 hrs. The recall for 
these conditions was compared with the recall for experimental condi- 
tions in which an interpolated list was learned immediately after the 
original learning. Under these conditions there was a slight tendency 
for RI to increase as a function of the time intervals as shown by the 
recall scores. However, because the recall scores were not consistent 
in the trend and because relearning measures did not demonstrate 
such a relationship, McGeoch concludes that RI remains relatively 
independent of the length of the time between learning and the 
retention tests. 

2. An analysis of the operations which define PI will show that 
PI must increase as a function of the time interval. The usual 
memory drum procedure allows six to eight secs. ‘rest’ between trials. 
To produce PI, this rest interval is simply lengthened after some arbitrarily
chosen level of learning is reached on the second list. After
this extended rest interval the S continues learning the second list. 
PI is determined by comparing the recall after the rest with the recall
of a list after a comparable rest period but which had no list preceding
it. This may be diagrammed as follows.


Experimental:   Learn List A   Learn List B   rest   Learn List B
Control:        --------       Learn List B   rest   Learn List B


If the rest interval is the usual six to eight secs., no inhibition after 
the interval will appear (unless the interval comes very early in the 
learning of the second (B) list in which case some associative inhi- 
bition —negative transfer—might be evidenced). However, if the 
rest interval during the learning of B is increased to 20 min., significant
PI will be measured. These facts suggest that PI must increase
as a function of the time between learning and the retention test. 


Proceeding on the assumption that PI increases as a function of 
the time interval following learning, and that RI remains constant 
over a comparable period, at some point following learning the first 
task should be as well recalled as the second. The present experi- 
ment was designed to test this implication by measuring the retention 
of two successively learned lists after a lapse of five and 48 hours. 


II. Method


Subjects and Materials.—The volunteer Ss for this experiment were 12 men and 12 women. 
These Ss learned lists of 10 paired-associates presented on a Hull-type memory drum. The 
paired-associate lists were constructed of two-syllable adjectives according to rules given else- 
where (7). For any one condition the Ss learned two lists which were of the A-B, A-C relation- 
ship, the stimulus word being the same in both lists but the responses different. 

The two lists for a particular condition were typed in capital letters on heavy white cloth 
tape, and in order to minimize serial learning three different orders of presentation of pairs within 
a list were used. A two-sec. anticipation period was used and eight sec. followed each trial. 
E kept a complete record of responses throughout learning and relearning.


TABLE I 


SEQUENCE OF CONDITIONS USED TO MEASURE THE RETENTION OF TWO SUCCESSIVELY
LEARNED LISTS AFTER FIVE AND 48 HOURS


The Initial Learning of Both Lists was Carried to a Criterion of One Perfect Trial while the
Relearning was to Two Perfect Trials. Mean Length of Obtained Rest Interval is Shown


Condition   First List   Second List   Mean Length of Rest   Recall and Relearn
PI-5        A-B          A-C           4 hrs. 55 mins.       A-C
RI-5        A-B          A-C           4 hrs. 58 mins.       A-B
PI-48       A-B          A-C           47 hrs. 56 mins.      A-C
RI-48       A-B          A-C           47 hrs. 54 mins.      A-B








Specific Conditions.—The four experimental conditions are outlined in Table I. For each 
condition two lists were learned in immediate succession and then, depending on the condition, 
either the first or second was recalled and relearned after five or 48 hrs. The actual mean time
of the rest interval is shown in Table I. It should also be noted that control conditions are not 
included in the design, so that the absolute amount of inhibition is not measured. It is assumed
that these conditions would produce both RI and PI. 

Each S served in all four experimental conditions but these conditions occurred in a different 
order for each S. Lists were held constant and conditions counterbalanced. With 24 Ss a 
complete counterbalancing of conditions is obtained, so that the difference in scores arising from 
unequal lists and positive transfer are equally distributed over all conditions. 
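
The counterbalancing described above can be checked by enumeration. The sketch below is an illustration rather than the author's procedure: it assumes that "complete counterbalancing" means each of the 24 Ss received one of the 4! = 24 possible orders of the four conditions. The function names are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Condition labels as given in Table I.
CONDITIONS = ("PI-5", "RI-5", "PI-48", "RI-48")


def counterbalanced_orders(conditions=CONDITIONS):
    """Return every possible order of the conditions (one order per subject)."""
    return list(permutations(conditions))


def position_counts(orders):
    """Count how often each condition falls on each experimental day (0-based)."""
    counts = Counter()
    for order in orders:
        for day, condition in enumerate(order):
            counts[(condition, day)] += 1
    return counts


orders = counterbalanced_orders()
counts = position_counts(orders)

print(len(orders))           # 24 distinct orders, one per subject
print(set(counts.values()))  # {6}: each condition occupies each day six times
```

Under this assumption every condition is served equally often on each of the four experimental days, which is what allows list differences and practice effects to be spread evenly over conditions.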

The four experimental days were preceded by two practice days. Two lists were learned on 
each of these days and the Ss were given practice in recalling both the first and second lists. 

From each pair of experimental lists one was selected as the critical recall list and was used 
for that purpose for a particular experimental day in the series of four regardless of the condition. 
Hence, if S were working under Condition PI-5 on the first experimental day, and if the two lists 
used on that day are designated X and Y, Y was learned first followed by X and then after five 
hours X was recalled. Another S, working under RI-5 on the first experimental day, would learn 
X first, then Y, and after five hours X would be recalled. The Ss thus recalled the same lists 
whether working under the RI or PI conditions. Following the relearning of the critical list, 
the other list was also relearned. The critical list was always learned and recalled on the left 
side of the drum, with the interfering list always being learned on the right side of the drum. 

The instructions at recall were made as specific as possible, so that the Ss were fully informed 
as to which list they were expected to recall. This was intended to minimize failures of recall 
attributable to S’s not knowing what was expected of him. The instructions just prior to recall 
were: “We shall go back to the first (or second) list which you learned at the last session. The 
first (or second) list learned is the list to your left. Be sure and give as many responses as you 
can on the first trial and continue with the list until you can say it correctly for two successive 
perfect trials.” 

All statistical computations to determine the significance of the differences between means 
are based on the direct-difference method for handling correlated measures (1). The computations
are based on 24 Ss and interpretations are made for 23 degrees of freedom. A t-value of
2.807 indicates a difference which is significant at the one percent level of confidence, while a t of
2.080 indicates a difference which is significant at the five percent level of confidence. 
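
The direct-difference method reduces each S's pair of scores to a single difference and tests the mean difference against its standard error. A minimal sketch follows. It assumes the standard error is computed from the sample standard deviation with N - 1 in the denominator; the source does not state which denominator reference (1) prescribes. The scores are invented for illustration and are not data from the experiment.

```python
import math
from statistics import mean, stdev


def direct_difference_t(scores_a, scores_b):
    """Paired (direct-difference) t for two correlated sets of scores.

    Returns (mean difference, standard error of the mean difference, t, df).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("each subject must contribute one score per condition")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = mean(diffs)
    # Assumption: sample standard deviation (N - 1 denominator).
    se_diff = stdev(diffs) / math.sqrt(n)
    return mean_diff, se_diff, mean_diff / se_diff, n - 1


# Hypothetical recall scores for six subjects under two conditions.
condition_a = [3, 4, 2, 5, 3, 4]
condition_b = [2, 2, 1, 4, 3, 2]

mean_diff, se_diff, t, df = direct_difference_t(condition_a, condition_b)
print(f"difference = {mean_diff:.2f} +/- {se_diff:.2f}, t = {t:.2f}, df = {df}")
```

With 24 Ss the same computation yields 23 degrees of freedom, the figure against which the critical values above are read.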


III. Results


Original Learning.—Table II shows the mean number of trials to 
reach the criterion of one perfect recitation on the four critical lists. 


TABLE II


MEAN TRIALS REQUIRED TO REACH THE CRITERION OF ONE PERFECT RECITATION
ON THE INITIAL LEARNING OF THE CRITICAL LISTS


Condition   Mean    σM
PI-5        11.00   1.01
RI-5        12.00   1.06
PI-48       10.83    .75
RI-48       12.17   1.10


On the PI conditions these lists were preceded by the learning of 
another list and on the RI conditions they were followed by the learn- 
ing of another list. Due to positive transfer, this order tends to make 
for faster learning of the PI (second) lists. This is suggested by the 
differences in the means of the PI conditions as compared with the 
RI conditions. For the single lists shown in Table II, and for only 24 
Ss, the evidence for positive transfer is not significant. However, 
if all four first lists for the experiment are combined into a single 
mean value for each S and these means are compared with the means 
of all four second lists, statistically significant evidence for positive 
transfer is found. The grand mean of the first lists is 12.96 trials and
for the second lists it is 11.17 trials. The difference, 1.79 ± .61,
gives a t ratio of 2.93 which indicates significance beyond the one 
percent level of confidence. Although there is some evidence for 
associative inhibition on the first anticipation trial of the second lists 
the difference will not approach statistical significance. It may be 
concluded that with the high degree of learning of the first list used 
here, and with moderately practiced Ss, positive transfer will occur 
if the second list is learned to the same criterion as the first. Meas- 
urable associative inhibition is not present. 

Inhibition at Recall.—The mean number of correct anticipations on 
the first three relearning (recall) trials is shown in Table III and in 
Fig. 1. It will be seen that after five hours recall of the second list is 


TABLE III 


RETROACTIVE AND PROACTIVE INHIBITION AFTER FIVE AND 48 HOURS AS SHOWN BY THE MEAN
NUMBER OF CORRECT ANTICIPATIONS ON THE FIRST THREE RELEARNING TRIALS


                          Relearning Trial
Condition   1                     2              3
            M      σM             M      σM      M      σM
PI-5        3.13   .45            7.13   .32     7.88   .35
RI-5        1.71   .36            5.29   .49     7.00   .34
PI-48       2.09   .37            6.63   .28     8.13   .27
RI-48       2.13   [illegible]    6.13   .39     7.75   .33








better than recall of the first list—PI is less than RI. On the first 
trial the difference between the means of Conditions RI-5 and PI-5
is 1.42 ± .38, which gives a t of 3.74, while on the second trial the
difference is 1.84 ± .43 (t = 4.28). These results are thus analogous
to previous findings for much shorter time intervals. 

After 48 hours there is not a significant difference in the mean re- 
call scores of the first and second lists. The fact that a significant 
difference between RI and PI occurs after five and not after 48 hours 
seems to result from the combination of two changes neither of which,
when taken alone, will meet the requirements of rigid statistical
significance. There is, first, a decrease in the recall of the second list
between five and 48 hours. This decrease, which might be expected
on the basis of a usual forgetting curve, amounts to a mean difference
of 1.04 ± .47 which gives a t ratio of 2.21. Secondly, there is the
suggestion that the recall of the first list is better after 48 hours than 
after five hours. This increase in recall will satisfy no statistical 
tests but it is rather consistent throughout the first three relearning 
trials. The basic facts as shown by the recall scores are, however, 
Fig. 1. Mean number of correct anticipations on the first three relearning trials after five and
48 hours. RI refers to the recall of the first list; PI to the recall of the second


that after five hours RI is greater than PI whereas after 48 hours no 
such difference occurs. It is not known, of course, at what point be- 
tween five and 48 hours the recall of the two lists would become ap- 
proximately equal. 

Inhibition in Trials to Relearn.—-The mean number of trials re- 
quired to relearn the four critical lists to one perfect and two succes- 


TABLE IV 


RETROACTIVE AND PROACTIVE INHIBITION AS SHOWN BY THE MEAN TRIALS TO RELEARN THE
CRITICAL LISTS TO ONE PERFECT AND TWO SUCCESSIVE PERFECT TRIALS


            One Perfect                  Two Perfect
Condition   M             σM             M      σM
PI-5        4.63          [illegible]    5.71   [illegible]
RI-5        6.58          .83            7.75   .92
PI-48       [illegible]   .29            4.58   [illegible]
RI-48       4.54          .48            5.63   [illegible]





sive perfect trials is shown in Table IV. The relearning curves, 
plotted in terms of trials to reach successive criteria, are shown in 
Fig. 2. These measures of relearning tend to support the suggestion 
found at recall that the first list is less well retained after five hours 
than after 48 hours. The difference between RI-5 and RI-48 in 
Fig. 2. Relearning curves showing the number of trials required to reach successive criteria.
Relearning was carried to two perfect trials (2P) 


mean number of trials to reach the criterion of one perfect trial is
2.04 ± .83 which gives a t of 2.46, and to reach two successive perfect
trials the difference is 2.12 ± .84 (t = 2.52).
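
Each t reported in the Results is a mean difference divided by its standard error. The sketch below recomputes the ratios from the values printed in the text and labels them against the two critical values given in the Method section (2.080 and 2.807). The row labels and function names are illustrative only, and the labels paraphrase the comparisons described in the text.

```python
# Critical values quoted in the Method section.
T_5_PERCENT = 2.080
T_1_PERCENT = 2.807

# (comparison, mean difference, standard error, t as printed in the text)
REPORTED = [
    ("first vs. second lists, trials to learn", 1.79, 0.61, 2.93),
    ("RI-5 vs. PI-5, first recall trial", 1.42, 0.38, 3.74),
    ("RI-5 vs. PI-5, second recall trial", 1.84, 0.43, 4.28),
    ("second list, five vs. 48 hours", 1.04, 0.47, 2.21),
    ("RI-5 vs. RI-48, trials to one perfect", 2.04, 0.83, 2.46),
    ("RI-5 vs. RI-48, trials to two perfect", 2.12, 0.84, 2.52),
]


def t_ratio(diff, se):
    """t as the ratio of a mean difference to its standard error."""
    return diff / se


def significance_level(t):
    """Label a t value using the critical values quoted in the source."""
    t = abs(t)
    if t >= T_1_PERCENT:
        return "1 percent"
    if t >= T_5_PERCENT:
        return "5 percent"
    return "not significant"


for label, diff, se, printed_t in REPORTED:
    t = t_ratio(diff, se)
    print(f"{label}: t = {t:.2f} (printed {printed_t}), {significance_level(t)}")
```

Every recomputed ratio agrees with the printed t to two decimal places.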


Overt Inter-List Intrusions.—Table V gives the total number of


TABLE V 
TOTAL OVERT INTER-LIST INTRUSIONS AT RECALL


Condition   First Recall Trial   Other Trials
PI-5        22                   3
RI-5        16                   3
PI-48       29                   4
RI-48       20                   2


intrusions occurring at recall. These intrusions are specific responses 
from the interfering list, i.e., from the second when the first is being 
recalled, and from the first when the second is being recalled. As 
indicated in the table, few intrusions occurred after the first trial. 
In terms of gross totals the frequency of intrusions is somewhat
greater for the PI conditions than for the RI conditions, although 
none of the differences prove significant by a chi-square test. 

Table VI presents another method of analyzing the responses on 
the first recall trial. For each item at recall there are four possible 
outcomes: (1) correct response, (2) intrusion by specific response from
interfering list, (3) other intrusions, such as giving a stimulus word 


TABLE VI 


RESPONSE ANALYSIS ON FIRST RECALL TRIAL SHOWING THE TOTAL RESPONSES ATTEMPTED,
THE PERCENT INTRUSIONS, CORRECT RESPONSES, AND OTHER RESPONSES


                        Intrusions    Correct Responses    Other
Condition   Attempts    N      %      N      %             N      %
PI-5        98          22     22     75     77            1      1
RI-5        59          16     27     41     70            2      3
PI-48       85          29     34     50     59            6      7
RI-48       79          20     25     51     65            8      10





or misplacing a response, and (4) no response. Using the total at- 
tempted responses as the base, the percent of occurrence of each of 
the first three types of outcomes is shown in Table VI. The table
shows clearly that fewer responses were attempted on the RI-5 condition
though the percent correct of those attempted is not greatly
different from the other conditions. These facts seem to indicate
that the Ss simply have fewer responses available on the RI-5 condi- 
tion than on the other three conditions. 
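
The percentages in Table VI take the total attempted responses as the base. The sketch below redoes that arithmetic. The counts are a reading of the scanned table, where several cells are partly illegible, so they should be treated as an assumption; they are at least internally consistent, since the three outcome counts for each condition add up to the attempts figure. The names used are hypothetical.

```python
# Outcome counts on the first recall trial, as read from Table VI (assumed reading).
TABLE_VI = {
    "PI-5": {"intrusions": 22, "correct": 75, "other": 1},
    "RI-5": {"intrusions": 16, "correct": 41, "other": 2},
    "PI-48": {"intrusions": 29, "correct": 50, "other": 6},
    "RI-48": {"intrusions": 20, "correct": 51, "other": 8},
}


def percent_of_attempts(counts):
    """Return (total attempts, percent of attempts for each outcome)."""
    attempts = sum(counts.values())
    percents = {name: 100 * n / attempts for name, n in counts.items()}
    return attempts, percents


for condition, counts in TABLE_VI.items():
    attempts, percents = percent_of_attempts(counts)
    summary = ", ".join(f"{name} {pct:.0f}%" for name, pct in percents.items())
    print(f"{condition}: {attempts} attempts ({summary})")
```

On this reading RI-5 has by far the fewest attempts, while its percent correct falls within the range of the other three conditions, which is the pattern the paragraph above describes.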


IV. Discussion 


The major findings of the present study are: (1) better five-hour 
recall of the second than of the first of two successively learned lists, 
(2) equal recall of the two lists after 48 hours, and (3) no decrease in 
the recall of the first list between five and 48 hours (with some indi- 
cation in both recall and relearning that the first list is retained better 
after 48 hours than after five). These findings, of course, will need 
corroboration by other methods and materials before their full im- 
plications can be stated. Nevertheless, the significance of the find- 
ings for theories of forgetting may be suggested tentatively. 

It may first be noted that the results support a conclusion that 
RI decreases as a function of the time intervals used here. Although 
no control condition was used it is a fairly safe assumption that the 
retention of a singly learned list would have shown a decrement be- 
tween five and 48 hours. Thus, when compared with the retention 
of the first list in the present study, the absolute RI would be seen
to decrease between five and 48 hours. McGeoch’s study (2), men- 
tioned earlier, showed that RI remained relatively constant over a 
period of 144 hours. McGeoch’s retention curve for the first list 
showed a sharp drop between one and 48 hours. There was no indi- 
cation of such a drop between five and 48 hours in the present experi- 
ment. McGeoch, however, used serial lists and the first list was 
learned to three perfect trials (17 to 18 repetitions of the list) while 
the second list was presented for only 10 trials. Differences in the
tasks and the degrees of learning may thus account for the discrep- 
ancies between the two experiments. 

The unlearning theory (4, 5) states that, during the learning of a
second task, the first-task response tendencies are weakened or unlearned
as a consequence of their nonreinforcement during the learning
of the second task. A clear-cut implication of this postulate is
that if two tasks are learned to an equal degree the retention of the 
first task will be less than the retention of the second. As stated in 
the introduction, this deduction has been confirmed when retention 
measurements are made after short intervals of time. In the present 
experiment the implication was substantiated after five hours but 
not after 48 hours. The unlearning theory will probably have to be 
revised to account for this finding. 

The present results also pose a problem in explaining the better 
retention of the first list after 48 hours than after five. Actually the 
most conservative interpretation that could be placed on the recall 
and relearning data is that retention of the first list is as good after 
48 hours as after five. Either of these two conclusions is somewhat 
contradictory to the usual conception of forgetting. The passage
of time is presumed to allow for the occurrence of processes which are 
detrimental to retention. Newman (6) has presented data which 
appear to have suggested this same contradiction. Newman showed 
that, for a group of three lists of nonsense syllables learned succes- 
sively, retention as measured by savings scores was somewhat better 
after 48 hours than after one hour (after one hour there was 30 per- 
cent saving in relearning, whereas after 48 hours there was nearly 
41 percent saving). When a single list was learned the expected 
decrement with the passage of time was found. Newman does not 
give independent data on each of the three lists which were learned 
successively so it cannot be determined whether all lists showed the 
enhanced recall after 48 hours. 

It seems likely that the present results as well as Newman’s re- 
sults require the postulation of some factor which opposes the usual 
forgetting processes. It may be possible to do this by amplifying 
the basic postulate of the unlearning theory. If it be assumed that 
unlearning is a process having the basic characteristics of experimental
extinction of a conditioned response, and if it be further assumed
that some sort of recovery of these weakened response tendencies
takes place, the present results could be adequately handled.

The recovery from the weakening or unlearning provides the 
factor which opposes the usual forgetting processes. Using the re- 
tention of the second list as a reference (the second list has not been 
unlearned), the results of the present experiment suggest that re- 
covery of the first list is not complete by the end of five hours but is 
complete at some point between five and 48 hours. On the basis of 
a recovery theory it would be expected that in Newman’s experiment 
the greater savings which accompanied the relearning after 48 hours 
should hold only for the first two of his three lists and not for the third 
since it would never have been weakened by unlearning. 


V. SUMMARY 


Twenty-four Ss learned lists of 10 paired two-syllable adjectives 
by the anticipation method. For each of the four experimental con- 
ditions two lists having the A-B, A-C relationship were learned to a 
criterion of one perfect recitation. The four conditions for recall 
were: (1) first list after five hours, (2) second list after five hours, (3) 
first list after 48 hours, and (4) second list after 48 hours. 

The results show that: 


1. After five hours the second list is better retained than the first. 

2. After 48 hours the two lists are equally well retained. 

3. The first list is as well retained after 48 hours as after five. 

4. Overt inter-list intrusions tended to be somewhat more frequent 
after 48 hours than after five and somewhat more frequent during 
the recall of the second list than during the recall of the first. 


In view of these findings it was suggested that the unlearning 
theory be revised. It was proposed tentatively that unlearning be 
identified as analogous to experimental extinction of a conditioned 
response and, furthermore, that like extinguished conditioned re- 
sponses, these unlearned verbal associations recover in strength with 
the passage of time. 


(Manuscript received May 5, 1947) 
DRIVE SPECIFICITY AND LEARNING


BY EDWARD L. WALKER 


University of Michigan * 


INTRODUCTION 


The question of the differential function of various operationally 
identified drive states and what the animal learns in a given experi- 
mental situation has considerable significance to learning theory. 
A series of experiments which demonstrates the concomitant estab- 
lishment of two opposed spatial responses yields results which at 
least pose difficult problems for learning theorists. 


Hull (1) and later Leeper (3) using single choice point mazes have shown 
that differential habits can be established in a situation in which only the 
character of the drive, the drive stimuli and the consequent consummatory 
response have been varied. By alternating, in an irregular fashion, hunger 
and thirst drives, both investigators were able to establish in their animals 
a tendency to perform the appropriate act to obtain food when hungry and 
to perform another act to obtain water when thirsty. All elements of the 
stimulus situation, except the drive stimuli, seemed to have been the same, 
and the results permitted interpretation of differential habit formation on 
this basis. 

Kendler (2) performed a similar experiment in which the animals were 
motivated to approximately an equal degree for both food and water 
during the preliminary training. Thus, the internal drive stimulus components
of the total stimulus patterns would be the same. Yet he was able
to demonstrate that differential habits had been set up, for when his animals 
were motivated for only one of the incentives, they responded with the act 
appropriate to obtaining the goal for which they were motivated. Thus 
it seems necessary at least to look to some aspect of the consummatory re- 
action for the differentiating element. 

Spence and Lippitt (4) among others have performed an experiment 
which shows that even when preliminary training is carried out with the 
animals satiated for both food and water, when subsequently motivated for 
either, they are able to respond appropriately. [Thus] learning has apparently
taken place with no experimentally [important] difference in either
the drive stimuli or the consummatory reaction, the latter being absent 
altogether in this situation. These results also seem to make difficult any 
explanation in terms of primary need-reduction, drive-reduction, tension- 
reduction or the like; although it is by no means disproved that they might 
be functional components in most learning. 


* The data for this experiment were collected at Stanford University in 1946.
If neither differential internal drive stimuli nor a consummatory
response is necessary for differential habit formation, what is neces- 
sary? 

One possible approach is offered by non-reinforcement or field 
theorists. In general they say that most learning problems are 
simply demonstrations of performance of something that has previously
been learned. The learning itself consists in a ‘cognition,’
‘sign-gestalt,’ or knowledge of ‘what-leads-to-what,’ the acquisition
of which requires neither specific motivation nor a consummatory
response in the sense of actually eating the food. These conditions
serve only in the development of performance. It seems necessary
only that the organism ‘have experience with’ the situation and the
incentive being manipulated.¹


Spence and Lippitt (4, 5) performed another experiment in which all the 
requisites for the formation of a ‘cognition’ of what led to food seemed to 
have been met, yet the animals did not respond in a manner to suggest 
that such learning had taken place. They used a simple Y-maze, each 
arm of which led to a goal box. For one group of animals (the F Group) 
one arm led to food and the other to water. For the second group (the O 
Group) one box contained water and the other was empty. 

All animals were thirst motivated under 18 hours of water deprivation 
and were given five trials per day, two trials in which the animals were free 
to choose either side, and three trials in which they were forced to run to 
one side or the other. Training was continued for 12 days or until both
groups made all free choice turns to the water side. Through the technique
of forcing trials, Spence and Lippitt succeeded in giving their animals, by the 
time they had completed the criterion trials to the water side, as many trials 
to the opposite side as had been required to ‘reach’ the criterion of the 
water-going habit. Then the motivation was shifted to hunger and food 
placed in the maze for all animals. If the motivation during training was 
unimportant, the animals of the F Group (those with food in the maze during 
preliminary training) should have turned immediately to the food. This 
they did not do. All animals turned to the water side on the first day of 
hunger motivation. Continued training under these conditions at a rate 
of five free trials per day failed to produce any distinguishable difference 
between the two groups in the acquisition of the food-going habit. 


Thus it would seem that the development of a ‘cognition’ or re- 
sponse to an unwanted goal object, while it can take place without 
specific motivation for that goal object, cannot take place in the pres- 
ence of a strong drive for another incentive or goal object. 

The present experiment was performed to help verify this finding 
and to determine if possible, by minor modifications in the design, if 
factors other than the inappropriate strong drive contributed to the 


1A more precisely formulated review of such theories is that written by White (6). 
apparent failure of the animals to establish a response to the unwanted
food during the original training. Specifically it was hoped
that a substantial reduction could be made in the amount of frustra- 
tion involved in forcing the animal to the non-water side when 
thirsty, and perhaps greater assurance could be obtained that the 
animals made some reaction to the unwanted food. This was at- 
tempted by allowing the animal to obtain the appropriate goal object 
on every trial and by heaping the food in the other goal compartment 
so that the animal was forced at least to climb over the food to reach 
the water. 


EXPERIMENTAL PROCEDURE 
Subjects 


Twenty albino rats from the colony maintained at Stanford University were used in the 
experiment. They were naive and were 80-85 days of age on the first preliminary day. In 
addition, four rats were discarded: two because they exceeded an arbitrary limit of one hour at 
the choice point on the first preliminary day, one rat on the second day of training because of a 
fairly serious injury accidentally received in the home cage, and a fourth at the same time to
equalize the two groups. 


Apparatus 


The maze was 50 in. square with a 36-in. starting alley at one corner (see Fig. 1). The goal 
boxes were 10 in. square and removable. The left one was fitted with a standard water bottle, 
while the right one was empty or contained food according to the experimental conditions. The
maze path was four in. square and the top was covered with 3 in. hardware cloth. Gates to pre- 
vent retracing were located 10 in. before the choice point, four in. beyond the choice point 
on each side, and four in. from either entrance to each goal box, a total of seven gates. Cloth 
curtains hung two in. beyond the choice point in each direction obscured the view of the goal box 
until after the choice had been made. A small hurdle was located near either end of each choice 
alley. The left alley floor was covered with coarse hardware cloth, while the right choice alley 
was covered with fine copper hardware cloth. The screens on top of the choice point and on the 
goal boxes were hinged and held down by four-in. square bricks. This permitted removal of 
the animals from the goal box and the insertion of a forcing block at the choice point. The 
forcing block was a piece of wood cut to fit the alley on whichever side of the choice point 
it was required. The gates were string controlled from behind the starting point. The maze
was placed on the floor of a room next to that containing the home cages, and the starting alley 
led directly away from it. The room was illuminated by one large light which was placed slightly 
to the left and slightly beyond the maze so that the shadow of the wall in the left choice alley was 
quite narrow and the shadow in the right alley was more than one-half the width of the alley. 


Preliminary Training 


Day 1, Preliminary Familiarization.—All doors were open and neither food nor water was in 
the maze. The animals were placed in the maze four at a time and permitted to explore it for a 
period of 20 min. They were removed from whatever point they happened to be at the end of 
that time. 

Days 2 and 3, Test for Position Habit.—Each animal was given four trials per day, two free 
and two forced. In each case the animal had an equal number of trials to each side, regardless 
of the direction of the free choices. The gates were lowered behind the animal and he was 
removed from the goal box on the side of the run. On the basis of their performances on these 
two days they were divided into a food or F Group and a no-food or O Group. As a whole the 











42 EDWARD L. WALKER 

















Fig. 1. Diagram of experimental maze


animals showed a slight preference for the left turn. When the two days are taken together, 
each group made 13 out of 20 possible first turns to the left, but the tendency to alternate on 
successive trials on the same day was so great that of the 40 free trials for each group, each had 
21 turns to the left. Thus only one animal of each group was forced twice to the same side on a 
given day, and then for only one of the two days. 


Learning Series 


Days 4 to 11, Thirst Motivation.—During this series the animals were trained under approxi- 
mately 23 hours of water deprivation and were satiated for food by having the regular food 
present in the home cages at all times. At least apparent satiation was accomplished by this 
procedure as none of the animals of the F Group consumed food while in the maze. During this 
training series the left-hand goal box was fitted with a water bottle similar to those from which 
the animals had drunk ordinarily in the home cages. A small quantity of water was poured on 
the floor of the box to equalize this factor for all animals; since if this was not done, a progressively 
larger drip spot might have favored animals run later in the day. Each animal had five trials
per day, two free and three forced. This technique assured a number of trials to the right or long
path, permitted forced trials at the same time to the short side, and it was hoped would minimize 
the possibility of setting up a connection between being forced and going the long way around. 
No retracing was permitted, and the animals reached the water on every trial. Each was per- 
mitted to drink for about five sec. on each trial, and the five daily trials were separated by a short 
pause while the gates were reopened. After the five trials, each animal was placed in a transition 
cage which was supplied with water and was allowed to remain there for 20 min. after which he 
was returned to the home cage where he was again under a condition of water deprivation. 

The ten animals of the O Group were run first each day and the goal box on the right-hand 
side was empty for this group. For the F Group the right-hand goal box was supplied with a 
large quantity of their regular food, medium-large Purina dog pellets. The quantity was suffi- 
cient so that the animals had to run over the food or kick the pellets out of the way to pass through 
the box. The pile was not sufficiently high, however, to prevent a young rat in a hurry from 


rounding the turn in considerable haste. The food box was removed, scrubbed, and aired for
about 20 hours before the next day's runs.


Test Series 


Days 12 to 20, Hunger Motivation.—When both groups of animals, taken together, had 
reached a criterion of two days in which all free trials were to the left, or short path to the water; 
and at the same time had had as many trials, both free and forced together, to the right or long 
path to the water as had been required to reach the criterion, the motivation was shifted to 
hunger. The water bottles were returned to the home cages and the food removed. Food was 
placed in the right-hand goal box for all animals. One trial per day was given each animal and 
he was permitted to eat in the goal box for 10 to 15 sec. In the first day or two of trials under 
hunger motivation, this required that the animals be left in the goal box for some time as they 
showed little disposition to eat. As training proceeded they ate progressively more readily. 
After each run the animal was placed again in a transition cage and allowed to eat for about 20 
min. and was then returned to the home cage. Two medium-large pellets were placed in the 
home cage for each animal, and these were consumed in each case within two to three hours. 
Training was continued for a total of nine days and was terminated when one of the animals 


fell ill. 


RESULTS 


The course of acquisition of the left, or short path, or water-going 
response may be seen in Table I. The F Group appears to have 
learned somewhat more rapidly. The O Group made a mean of 2.8
free turns to the right or long path while the F Group made a mean 
of 1.6. This difference yields a t-value indicating significance at the 
three percent level of confidence. If this difference represents any- 
thing other than a difference due to chance alone, it may be that the 
food in the right-hand goal box served very much as does a hurdle or 
other slight obstruction to facilitate learning in this group. The data
on the comparative number of correct choices may be seen in Fig. 2. 
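
The group means quoted above can be recovered from the daily counts of correct free choices in Table I. The following is a minimal sketch, assuming ten rats per group and two free trials per rat per day as stated in the procedure; the names `FREE_LEFT` and `mean_free_right_turns` are illustrative and do not come from the original report.

```python
# Correct (left) free choices per day, copied from Table I.
# Assumption from the procedure: 10 rats per group, 2 free trials per rat per day.
FREE_LEFT = {
    "O": [11, 12, 14, 17, 20, 18, 20, 20],
    "F": [10, 16, 18, 20, 20, 20, 20, 20],
}
RATS_PER_GROUP = 10
FREE_TRIALS_PER_DAY = 2 * RATS_PER_GROUP


def mean_free_right_turns(free_left):
    """Mean number of free right (long-path) turns per rat over the series."""
    total_free = FREE_TRIALS_PER_DAY * len(free_left)
    total_right = total_free - sum(free_left)
    return total_right / RATS_PER_GROUP


for group, counts in FREE_LEFT.items():
    # Prints the total of free left turns and the mean free right turns per rat.
    print(group, sum(counts), mean_free_right_turns(counts))
```

The same totals (132 and 144 free left turns out of 160) are the figures cited later for the end of training under thirst motivation. The t-value itself cannot be recomputed here, since the scores of individual animals are not reported.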

It can also be seen from Table I that by the end of the eighth day 
the total number of right or long-path turns, both free and forced, exceeded
the total number of left turns required to learn the water-going
response, that is, the total number of left turns at the end of
the sixth day. Thus, if free and forced turns can be considered to be 
equivalent in value, the animals which had found food in the right-hand
box might have been expected to ‘know’ that the food was there
and turn immediately to the right when the motivation was changed
to hunger. 
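
The comparison of accumulated turns can be checked directly against Table I. This is a small sketch under the assumption that the daily totals in the table are read correctly from the print; the helper name `accumulated_turns` is hypothetical.

```python
from itertools import accumulate

# Daily totals of left and right turns (free plus forced), copied from Table I.
TOTAL_LEFT = {
    "O": [23, 26, 29, 30, 30, 30, 30, 30],
    "F": [23, 28, 28, 30, 30, 30, 30, 30],
}
TOTAL_RIGHT = {
    "O": [27, 24, 21, 20, 20, 20, 20, 20],
    "F": [27, 22, 22, 20, 20, 20, 20, 20],
}


def accumulated_turns(group, right_day=8, left_day=6):
    """Return (accumulated right turns by right_day, accumulated left turns by left_day)."""
    acc_left = list(accumulate(TOTAL_LEFT[group]))
    acc_right = list(accumulate(TOTAL_RIGHT[group]))
    return acc_right[right_day - 1], acc_left[left_day - 1]


for group in ("O", "F"):
    right_by_day_8, left_by_day_6 = accumulated_turns(group)
    print(group, right_by_day_8, left_by_day_6, right_by_day_8 > left_by_day_6)
```

For both groups the accumulated right turns at the end of the eighth day are slightly greater than the accumulated left turns at the end of the sixth day, which is the relation the argument relies on.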

If, on the other hand, one feels that forced trials are not equivalent 
to trials in which the animal is free to turn either way, it is well to 
note that at the end of training under thirst motivation, the O Group 
had turned left 132 times in 160 free choice trials, while the F Group 
had turned left 144 times in 160 free choice trials. There is some 
qualitative evidence that not only are free and forced trials quite 
different in respect to the behavior of the animals, but there might 


TABLE I
LEARNING SERIES
Acquisition of the left (water) alley response under thirst motivation

O Group

        Correct        Total        Total        Accum.       Accum.
Day     Free (Left)    Left         Right        Left         Right
        Responses      Responses    Responses    Responses    Responses

1       11             23           27            23           27
2       12             26           24            49           51
3       14             29           21            78           72
4       17             30           20           108           92
5       20             30           20           138          112
6       18             30           20           168          132
7       20             30           20           198          152
8       20             30           20           228          172

F Group

        Correct        Total        Total        Accum.       Accum.
Day     Free (Left)    Left         Right        Left         Right
        Responses      Responses    Responses    Responses    Responses

1       10             23           27            23           27
2       16             28           22            51           49
3       18             28           22            79           71
4       20             30           20           109           91
5       20             30           20           139          111
6       20             30           20           169          131
7       20             30           20           199          151
8       20             30           20           229          171




















be a considerable difference between a forced trial to the long path 
and a forced trial to the short path. Early in learning the animals 
ran slowly and hesitated at the choice point on both free and forced 
trials. As learning progressed the animals ran more rapidly and on 
the free trials exhibited no hesitation at the choice point. If a trial 
was forced to the left or short side, one could notice no difference in the 
manner of performing from a correctly run free trial. If the block 
was placed in the left alley, behavior toward it varied. Some animals 
ran blindly into it for a few trials. Others approached more slowly 
and attempted moving or circumventing the block directly. Frequently
they would attempt to return to the starting alley, and as
often attempted to escape from the maze through the wire at the 
top before passing through the curtain and running quickly around 
to the water. 

It is also worthy of note that during all training trials animals 
might spend considerable time at the choice point, but rarely hesi- 
tated in or explored any other part of the maze. This was true even 
of the animals of the F Group when passing through the right-hand 
goal box. Frequently they were running fast enough at this point 
that the heaped food was scattered noisily in the box. 











Fig. 2. Acquisition of left (water) alley response under thirst motivation (curves for the F Group and the O Group plotted against days)


The results obtained when the motivation was changed to hunger 
may be seen in Fig. 3. On the first day all animals turned to the 
left and thus went the long way around to food. Training was continued
beyond the critical first day and a total of nine days at one
trial per day failed to produce any demonstrable difference in the 
performance of the two groups. 

However, qualitative differences were noted in their performances. 
On the first day of hunger motivation all animals ran more slowly 
and did not eat readily although they were under 23 hours of food 
deprivation and the experiment was run at their usual feeding time. 
Throughout the entire nine days the animals did not appear to eat as 








Fig. 3. Acquisition of right (food) alley response under hunger motivation (curves for the F Group and the O Group plotted against days)


readily as they had learned to drink under thirst motivated training, 
and most of their time in the food-goal box was spent in trying the 
gates and attempting to escape through the screen wire roof where 
they were always removed from the maze. 


DISCUSSION 


The results of the present experiment agree in general with those
of Spence and Lippitt as far as quantitative results are concerned.
That is, in both experiments the animals continued to go to water
even when hungry, and the food groups failed to show any superiority
in the acquisition of the food-going habit.

In the present experiment, however, the animals with food in the 
maze learned more rapidly than the O Group, and the difference in 
terms of mean errors approached statistical significance. The dif- 
ference between the two groups in this respect in the Spence-Lippitt 
experiment was smaller, favored the O Group rather than the F 
Group, and while a test of significance was not reported and sufficient 
data were not given to permit the making of such a test, it appears 
too small to be significant. This difference in the present experiment 
was unexpected and the design permits no interpretation of its effect 
upon the later test.

As in the Spence-Lippitt experiment, the results confirm the predictions
of the Hullian postulate system, the difference in design
merely changing the formulation in minor detail. In the Spence-Lippitt
design the excitatory potential $E_{\text{left}}$ is greater than $E_{\text{right}}$
because $N_{\text{right}}$, or number of reinforced trials to the right, is equal to
zero as they were never reinforced. The Hullian prediction would
be the same in the present instance since $E_{\text{left}}$ would be greater than
$E_{\text{right}}$ because of the greater length of the right-hand path, and as a
consequence the goal gradient hypothesis would be operant.

The present results confirm, in so far as they are applicable, 
White’s failure to predict the continuation of the left turn to water 
when the animals are made hungry. It may be noted, however, that 
all of the terms of White’s postulates may not have been fulfilled. 

His first or perceptual-learning postulate is stated as follows: 
“When a particular piece of behavior in a particular situation is once 
perceived as a path to a particular object, a more or less permanent
‘knowledge’ of this relationship usually results. As a rule this knowledge
is available to the organism when the same situation is again
presented.” 

The second or path-goal postulate is: “If there is a motive or ‘need,’
the goal of which is a particular object, and if at the same time there is
available the knowledge that a particular piece of behavior is a path to 
that goal-object, then the behavior will tend to occur.” 

White’s prediction of the right turn when hungry for the F Group 
assumes that all of the conditions of the postulates would be met by 
the experimental design and procedure. 

The path-goal postulate requires both a ‘need’ and ‘knowledge.’ 
Since White permits the inference of the ‘need’ for food from the
fact that the animal has been kept from eating for 23 hours, the 
failure of the animals to enter the right-hand alley when hungry seems 
to require the interpretation that learning had not taken place, or 
that the animals had not acquired the requisite ‘knowledge.’ 

White says that we can tentatively infer that the organism pos- 
sesses such ‘knowledge’ if we know that it has often seen the food on 
the right-hand side “‘under conditions favorable to this type of per- 
ceptual organization.” In both the Spence-Lippitt experiment and 
the present one, the animals almost undoubtedly ‘saw’ the food in the 
maze; therefore we must conclude that they saw it under conditions 
which were not “favorable to this type of perceptual organization.” 
Thus it would seem necessary that the characteristics of ‘favorable’ 
and ‘unfavorable’ conditions be stated. 

Thus, these results seem to indicate the need for further experi- 
mental analysis. The role of forced trials under these experimental 
conditions is at the very least ambiguous. Despite a procedure which
seemed to insure that the animals of the F Group could not avoid
experience with the food, their behavior in its presence differed almost 
not at all from the animals of the O Group in the empty box. 

Lastly, the results seem to admit of the interpretation that the 
animals learned nothing about the presence of food in the maze when 
they were trained under strong thirst motivation. In the Spence- 
Lippitt experiment the display of such ‘knowledge’ would have re- 
quired that the animal make a response that was competing with 
and incompatible to the behavior required to obtain the water. The 
design of the present experiment attempted to reduce the incompati- 
bility of the two responses, at least during the original training, by 
rewarding the animal on every trial. Yet in the test situation, the 
animal, in order to make a correct response, was required to make 
what had been the longer and less desirable turn to the water. It 
remains for experimental demonstration that the animal cannot learn 
a response to some sign for the unwanted food while under strong 
thirst when both the food-getting and the non-food-getting responses 
have previously led to water. 


SUMMARY 


The present investigation attempted to verify the findings of 
Spence and Lippitt that rats trained under one strong drive failed 
to demonstrate any evidence of learning a response to an inappropri- 
ate goal object although the number of experiences with that object 
was, by the criterion of mere frequency, sufficient for learning. 

Twenty naive, male albino rats were given eight days of training 
under 23 hours of water deprivation in a rectangular maze in which 
the left turn led to a short path to water and the right turn led to a 
long path to water through an additional goal box. For half the ani- 
mals, the O Group, the second box was empty. For the other half, 
the F Group, the box contained food. The F Group learned the water- 
going response more rapidly than the O Group to a degree approach- 
ing statistical significance. When the motivation was changed to 
hunger all animals continued to make the turn to water and the F 
Group demonstrated no superiority in learning the food-going response.
These results are in general agreement with those of Spence
and Lippitt. 

The nature of some of the ambiguities of the present design is
pointed out and the course of further experimental work is suggested.


(Manuscript received March 14, 1947)


PROBLEM SOLUTION BY MONKEYS FOLLOWING
BILATERAL REMOVAL OF THE PREFRONTAL AREAS: 
VI. PERFORMANCE ON TESTS REQUIRING 
CONTRADICTORY REACTIONS TO SIMILAR 
AND TO IDENTICAL STIMULI 


BY PAUL SETTLAGE, MYRA ZABLE, AND HARRY F. HARLOW 


University of Wisconsin * 


INTRODUCTION 


Earlier papers in this series have reported on the performance of
rhesus monkeys with bilateral prefrontal lobectomy on a variety of 
tests (4, 10, 5, 6, 2). These previous studies showed that, though 
the brain injured animals in general performed less well on most of 
the tests used, they did in some instances compare very favorably 
to normal animals on difficult problems. The general impression 
gained from animal studies, as well as from reports on some human 
cases, notably those of Ackerly (1) and Hebb and Penfield (7), is that
bilateral prefrontal amputation does not produce anything approach- 
ing the crippling effects which might naively be anticipated. An 
experienced observer would have difficulty differentiating the brain- 
injured from normal monkeys, providing that areas related to over- 
activity (9) had been spared and providing that ample time had been 
allowed for healing. It is believed that statistical treatment of re- 
sults of careful and extensive testing would differentiate the animals, 
but simple observation and casual testing would not suffice. 

The situation suggests that some important effects of prefrontal 
injury have been overlooked, or that the tests used have not been ade- 
quate to disclose such effects. 

Since the original studies by Jacobsen (8) the literature has indi- 
cated that prefrontal monkeys have more difficulty in the solution of 
the delayed reaction test than any other test. In an earlier paper 
of the present series (10, p. 430) it was pointed out that a significant 
feature of the delayed reaction test is that of the ‘reversals’ frequently 
occurring during successive trials. 

*From the Department of Anatomy and the Department of Psychology. This research
was supported in part by grants from the Special Research Committee of the University of
Wisconsin for 1944-46.

The present investigation was undertaken to exploit the concept
of reversal as a function dependent on integrity of the frontal association
areas. More specifically, the experiment was designed to test
the hypothesis that problems involving contradictory reactions to 
similar or identical stimuli are uniquely difficult for prefrontal animals.1
The performance of monkeys on such problems was studied
by presenting in special sequences tests which individually could be 
easily solved by the animals. 


APPARATUS AND METHOD 


An extensive test series, requiring a minimum of 90 testing sessions and 3900 test trials for
each animal, was used to compare unoperated and prefrontal monkeys.

The testing apparatus is illustrated in Figs. 1a-1d. It consisted simply of a panel with two
shallow depressions, serving as food-wells over which the test objects were placed. In testing,
the animal’s view of the apparatus was screened off while a food-well was baited and the test
objects and panel were placed in position. Then, after a screen had been lowered in front of the
experimenter to conceal him from the animal, the screen in front of the test cage was elevated to
permit a choice. Food was obtained if the correct object was chosen.2

Two types of test problems were used. In the one type, designated as position-discrimina- 
tion, a series of trials was conducted in which food was always placed in the food-well on the same 
side of the panel. The animal’s problem was to discover on which side food was being placed, 
and to maintain choices on this side regardless of which test objects were present, and regardless 
of the fact that the test objects shifted from side to side in the same kind of irregular order as in 
the second type of problem. 

The second type of problem, designated as object-discrimination, was one in which food was
always placed under the same one of a given pair of objects, and the positions of the objects were
switched in a predetermined, irregular order.

The tests were grouped into three series, A, B, and C. Series B demanded more frequent 
shifts in mode of problem solution than A, and C more than B. 


Series A 

This series comprised 30 tests, each using a different pair of stimulus objects. Only one test 
was given on one day, and 25 trials were administered.3 Whenever an animal failed to make 20
correct responses in 25 trials, the same test was repeated the next day. It was arbitrarily decided 
that if any animal failed to reach the criterion of 20 correct responses in 25 trials in four suc- 
cessive days, it would graduate to the next test in the series in spite of such failure. 
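
The repetition rule for Series A can be expressed as a short procedure. The sketch below is illustrative only: the function name `days_spent_on_test` and the example scores are hypothetical, and it assumes that a session counts as a success when 20 or more of the 25 responses are correct.

```python
CRITERION = 20         # correct responses required within one 25-trial session
MAX_DAYS_PER_TEST = 4  # after four failing days the animal graduates anyway


def days_spent_on_test(daily_correct):
    """Return (days_used, passed) for one Series A test.

    daily_correct holds one count of correct responses per 25-trial day,
    in the order the sessions were run (hypothetical input format).
    """
    for day, correct in enumerate(daily_correct, start=1):
        if correct >= CRITERION:
            return day, True
        if day == MAX_DAYS_PER_TEST:
            return day, False
    raise ValueError("sessions ended before the test was resolved")


print(days_spent_on_test([18, 21]))          # criterion met on the second day
print(days_spent_on_test([10, 12, 15, 19]))  # graduates after four failures
```

Under this rule a single test occupies from one to four days, so the number of trials given on a test ranges from 25 to 100.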

Fifteen of the tests in this series were position-discrimination and fifteen were object-discrimination
tests. The order in which they were given is indicated in the following series of symbols,
in which O represents an object-discrimination test and L and R represent position-discrimination
tests. L signifies that the left food well contained food; R signifies that the right food well contained
food: OLLOOROLOROROLROOLOROOLOLROORL.

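
The printed sequence can be tallied to confirm that it agrees with the stated composition of Series A. This is a minimal check; the variable names are illustrative.

```python
from collections import Counter

# Order of the 30 Series A tests as printed in the text:
# O = object-discrimination, L = left position rewarded, R = right position rewarded.
ORDER = "OLLOOROLOROROLROOLOROOLOLROORL"

counts = Counter(ORDER)
position_tests = counts["L"] + counts["R"]

print(len(ORDER), counts["O"], position_tests, counts["L"], counts["R"])
```

The tally gives 15 object-discrimination tests and 15 position-discrimination tests, the latter divided into 8 left and 7 right.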
Series B 


This series comprised 90 tests each using a different pair of stimulus objects. Three tests
were administered each day, and the number of trials was limited to 15 per test, giving a daily 
total of 45 trials. 





1 The expression ‘prefrontal monkeys’ will be used in this paper to refer to monkeys with
bilateral destruction of the prefrontal areas or, as they have also been called, the frontal association
areas.

2 Predetermined random order sequences and the various precautions and controls used in
this study followed the standard practices of this laboratory, and have been described in previous 
publications. Fuller discussion of these is omitted in the present article, except as they apply 
specifically to this study. 

3 A non-correction method was used throughout save for tests 1-10 of Series A in which a
re-run method was used and a maximum of five re-runs allowed for each trial. An initial false
response scored the trial as an error but not more than one error was scored for any trial.

Forty-five of the tests were object-discrimination and 45 were position-discrimination. The
right position was correct in 23 of the latter and the left was correct in 22. A chance arrangement
of the object-discrimination, right position, and left position tests was used on any single
test day, with the proviso that no one of them occurred more than twice in one day.
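
One way such a constrained chance arrangement might be drawn is sketched below. The report does not describe how the authors actually produced their schedule, so the rejection-sampling approach, the function name `draw_series_b_schedule`, and the seed are all assumptions made for illustration.

```python
import random
from collections import Counter

# Composition given in the text: 45 object (O), 23 right (R), 22 left (L) tests.
TESTS = ["O"] * 45 + ["R"] * 23 + ["L"] * 22
TESTS_PER_DAY = 3
MAX_SAME_TYPE_PER_DAY = 2


def draw_series_b_schedule(rng, max_attempts=100_000):
    """Shuffle the 90 tests until no day contains three tests of one type."""
    pool = list(TESTS)
    for _ in range(max_attempts):
        rng.shuffle(pool)
        days = [pool[i:i + TESTS_PER_DAY]
                for i in range(0, len(pool), TESTS_PER_DAY)]
        if all(max(Counter(day).values()) <= MAX_SAME_TYPE_PER_DAY
               for day in days):
            return days
    raise RuntimeError("no valid schedule found; raise max_attempts")


schedule = draw_series_b_schedule(random.Random(0))
print(len(schedule), schedule[:3])
```

Rejection sampling is used here only because it is easy to verify: every accepted schedule satisfies the proviso by construction, and every admissible arrangement is equally likely.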


Series C

This series comprised 120 tests, but the same pair of stimulus objects, namely a funnel and
salt cellar, was used for all. Four tests, limited to 15 trials each, were administered each day.
In other words, the same pair of stimulus objects served as a means of (1) administering a left-position
test for 15 trials, (2) administering a right-position test for 15 trials, (3) administering
an A-object (salt-cellar rewarded) test for 15 trials, and (4) administering a B-object (funnel
rewarded) test for 15 trials. This means that the animal found it necessary, at the beginning of
each series of 15 trials, to discover which test was current. There was no break in the testing
nor any indication of an impending change at the time of transition from one test to another.


TABLE I

CONDITIONS FOR SERIES A, B, AND C

                                           Series A           Series B           Series C

Number of tests                            30                 90                 120
Number of tests per day                    1                  3                  4
Number of object-discrimination tests      15                 45                 60
Number of position-discrimination tests    15                 45                 60
Number of trials per test                  25-100             15                 15
Number of trials per day                   25                 45                 60
Criterion of success                       20 correct in 25   none               none
Number of days testing                     30                 30                 30
Stimulus objects                           Different pair     Different pair     Same pair
                                           for each test      for each test      for all tests


It is to be noted that the order in which these four tests were conducted on any one day was
randomized, with the provisions that the same order did not occur on two successive days and
that the test used last on one day was not used as the first on the next day.

For the convenience of the reader, the conditions applying to each of the three series of tests
are here summarized in outline form in Table I.
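
The minimum of 90 testing sessions and 3900 test trials stated at the beginning of this section follows from the conditions of the three series. The sketch below recomputes both figures; the dictionary layout and function names are illustrative, and the Series A values are minima because failed tests were repeated.

```python
# Conditions of the three series as given in the text and in Table I.
SERIES = {
    "A": {"tests": 30, "tests_per_day": 1, "trials_per_test": 25},
    "B": {"tests": 90, "tests_per_day": 3, "trials_per_test": 15},
    "C": {"tests": 120, "tests_per_day": 4, "trials_per_test": 15},
}


def minimum_sessions(series):
    """Fewest testing days needed if no test has to be repeated."""
    return series["tests"] // series["tests_per_day"]


def minimum_trials(series):
    """Fewest trials needed if every test is given exactly once."""
    return series["tests"] * series["trials_per_test"]


total_sessions = sum(minimum_sessions(s) for s in SERIES.values())
total_trials = sum(minimum_trials(s) for s in SERIES.values())
print(total_sessions, total_trials)
```

The per-series trial totals are 750, 1350, and 1800, and each series requires at least 30 days.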


ANIMALS 


Five operated and five normal monkeys were used as Ss in this experiment. The operative
data and previous test experience of the five operated animals, numbers 22, 54, 55, 57, and 64,
have been described in a previous publication (2). Animal number 22 died before the completion
of the present experiment. The test history of the normal monkeys, numbers 67, 68, 83, 85,
and 86, has also been given in earlier papers (3, 11).

Fig. 1. Responses by animal 83 illustrating correct choices on two successive trials of the four
discrimination problems tested in a single day in Part III.
1a. Correct choice of ‘A’ stimulus on two successive trials of an object-quality discrimination.
1b. Correct choice of ‘B’ stimulus on two successive trials of an object-quality discrimination.
1c. Correct choice of the R-position on two successive trials of a positional discrimination.
1d. Correct choice of the L-position on two successive trials of a positional discrimination.


RESULTS 


A. Comparison of normal and prefrontal monkeys in terms of
number of errors in test performance

The monkeys with bilateral prefrontal lobectomy were consistently
poorer performers than the normal animals. In order to illustrate
this, the data of the two groups have been compared in three
ways:

1. Figs. 2-5 show the number (percent) of problems in each test
sequence which were solved without any error, the number solved
with one or fewer errors, the number solved with two or fewer errors, etc.
It is apparent from these curves that the prefrontal animals solved
fewer problems with few errors, and made a larger number of errors
on a larger percent of the problems than did the normal animals.

Fig. 2. Errors made on the first 25 trials of the 30 problems of part A by the normal and prefrontal
monkeys

Fig. 3. Errors made on the 15 trials of the 90 problems of part B by the normal and prefrontal
monkeys

Fig. 4. Errors made on the 15 trials of the 120 problems of part C by the normal and prefrontal
monkeys

Fig. 5. Errors made on trials 3-15 of the 120 problems of part C by the normal and prefrontal
monkeys

2. Table II compares the mean error scores on successive blocks
of 30 tests for Series A and B and successive blocks of 40 tests for
Series C. It is to be noted that the error scores of the operated
monkeys, taken for individual animals or for the group, are in all cases
higher than for normal animals on the comparable block of tests.
The difference between the grouped mean error score of the normal
and operated monkeys is significant at the 0.1 percent confidence
level in five of the comparisons, at the 1.0 percent confidence level in
three, and at the 2.0 percent confidence level in one of the comparisons.


TABLE II

MEAN ERROR SCORES OF NORMAL AND PREFRONTAL MONKEYS ON
SUCCESSIVE GROUPS OF TESTS

                       Normal Monkeys                         Prefrontal Monkeys
                                           Group                                   Group       P
Tests         83    85    86    67    68   Mean       54    55    64    57    22   Mean    (percent)

Series A:
1-30         3.8   5.1   3.8   5.0   4.7   4.48      6.4   6.4   7.9   8.2   8.1   7.40      1.0
Series B:
31-60        3.1   3.4   3.8   3.0   3.1   3.28      4.8   4.6   4.9   5.0         4.82      0.1
61-90        2.0   2.4   3.4   2.4   2.2   2.48      4.0   3.5   4.3   5.1         4.22      2.0
91-120       2.3   2.2   3.3   2.4   2.2   2.48      4.1   4.6   4.1   5.3         4.52      1.0
31-120       2.5   2.7   3.5   2.6   2.5   2.76      4.3   4.2   4.4   5.1         4.50      0.1
Series C:
121-160      3.0   4.4   3.8   3.7   4.1   3.80      5.5   5.4   [?]   6.4         5.90      0.1
161-200      2.9   3.4   2.9   3.4   3.2   3.16      5.7   5.1   5.8   6.6         5.80      0.1
201-240      3.1   4.8   3.2   4.2   3.8   3.82      6.2   5.2   6.0   6.9         6.10      1.0
121-240      3.0   4.2   3.3   3.8   3.7   3.60      5.8   5.2   6.0   6.6         5.97      0.1
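
The group means for the normal animals in Table II, and the tally of confidence levels given in the text, can be rechecked from the printed entries. The sketch below uses only the normal-monkey columns, which are fully legible; the names `NORMAL_ROWS` and `group_mean` are illustrative.

```python
from collections import Counter

# (tests, scores of normal monkeys 83, 85, 86, 67, 68,
#  printed group mean, printed P in percent), copied from Table II.
NORMAL_ROWS = [
    ("1-30",    [3.8, 5.1, 3.8, 5.0, 4.7], 4.48, 1.0),
    ("31-60",   [3.1, 3.4, 3.8, 3.0, 3.1], 3.28, 0.1),
    ("61-90",   [2.0, 2.4, 3.4, 2.4, 2.2], 2.48, 2.0),
    ("91-120",  [2.3, 2.2, 3.3, 2.4, 2.2], 2.48, 1.0),
    ("31-120",  [2.5, 2.7, 3.5, 2.6, 2.5], 2.76, 0.1),
    ("121-160", [3.0, 4.4, 3.8, 3.7, 4.1], 3.80, 0.1),
    ("161-200", [2.9, 3.4, 2.9, 3.4, 3.2], 3.16, 0.1),
    ("201-240", [3.1, 4.8, 3.2, 4.2, 3.8], 3.82, 1.0),
    ("121-240", [3.0, 4.2, 3.3, 3.8, 3.7], 3.60, 0.1),
]


def group_mean(scores):
    """Arithmetic mean of the individual error scores."""
    return sum(scores) / len(scores)


for tests, scores, printed_mean, _ in NORMAL_ROWS:
    print(tests, round(group_mean(scores), 2), printed_mean)

# Number of comparisons at each confidence level.
level_counts = Counter(p for _, _, _, p in NORMAL_ROWS)
print(level_counts)
```

The recomputed means agree with the printed group means for the normal animals, and the tally of levels is five at 0.1 percent, three at 1.0 percent, and one at 2.0 percent.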

3. The average error scores for the normal and operated monkeys
differed little in the first 5 trials of the grouped data for the three series
of tests. Marked differences between the two groups appeared in
later trials of the tests, and these differences increased as the number
of trials continued. These data are presented in Table III, which
shows that the ratio of error scores between normal and operated
animals for trials 1-5 is about 1:1.1. In trials beyond the fifth, the
ratio ranges from 1:2.0 to 1:3.5. The tendency for the ratio to increase
in the later trials of the problems is more pronounced in Series
B and C than in Series A.
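
The ratios may be recomputed from the group means printed in Table III. In the sketch below the means are taken as printed; the recomputed ratios can differ from the printed ones in the last digit, presumably through rounding or errors in the surviving copy, so only the general pattern is checked. The names `MEANS` and `error_ratio` are illustrative.

```python
# (trial block, normal group mean, prefrontal group mean), as printed in Table III.
MEANS = {
    "A": [("1-5", 58.0, 67.0), ("6-15", 44.2, 86.6), ("16-25", 32.0, 67.4)],
    "B": [("1-5", 152.8, 182.0), ("6-10", 58.8, 126.3), ("11-15", 29.2, 100.0)],
    "C": [("1-5", 279.4, 299.2), ("6-10", 97.0, 210.8), ("11-15", 49.8, 176.2)],
}


def error_ratio(normal_mean, prefrontal_mean):
    """Prefrontal errors per normal error for one block of trials."""
    return prefrontal_mean / normal_mean


ratios = {
    series: [(block, error_ratio(n, p)) for block, n, p in rows]
    for series, rows in MEANS.items()
}
for series, rows in ratios.items():
    print(series, [(block, round(r, 2)) for block, r in rows])
```

In each series the ratio for trials 1-5 lies close to 1, and the ratio rises in the later blocks, most steeply in Series B and C.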

Excluding trials 1-5, the differences between error scores made by 
the normal and operated animals are significant at the 0.1 percent 
confidence level for all three series of tests. Though the differences 
between the two groups as shown in Table II are statistically signifi- 
cant at a high confidence level, an enhancement of the statistical 
significance is brought about (Table III) by eliminating the first five 
trials of the tests from the calculations. 

The data as thus far presented show that the essential error score 
differences between normal and prefrontal monkeys occurred in the 
later rather than the earlier trials of the tests. The trend of the error 
scores is further defined in the next section. 


TABLE III

ERROR SCORES ON TRIALS 1-5 COMPARED TO ERROR SCORES ON LATER TRIALS OF THE TESTS

                          Normal Monkeys                        Prefrontal Monkeys
                  83    85    86    67    68      M       54    55    57    64    22      M     Ratio

Series A (Tests 1-30)
Trials 1-5        52    67    54    57    58    58.0      58    66    72    66    72    67.0     1.1
Trials 6-15       37    47    41    35    41    44.2      78    79    91    94    91    86.6     2.0
Trials 16-25      24    31    21    42    43    32.0      56    50    81    80    70    67.4     2.1

Series B (Tests 31-120)
Trials 1-5       170   158   150   142   144   152.8     175   181   193   179         182.0     1.2
Trials 6-10       89    30    62    50    54    58.8     141   105   142   114         126.3     2.2
Trials 11-15      50    21    29    27    19    29.2      98    68   133   101         100.0     3.5

Series C (Tests 121-240)
Trials 1-5       283   274   288   272   280   279.4     283   284   325   305         299.2     1.1
Trials 6-10       75    78   138    84   110    97.0     215   180   238   210         210.8     2.2
Trials 11-15      47    32    66    50    54    49.8     191   160   189   165         176.2     3.5


B. Comparison of the patterning of error scores of normal and
prefrontal monkeys.












































The choice-response patterns which would permit solution (ob- 
taining of food reward) of the 240 tests of this study can be classified 
into four categories: 


(1) ‘A’ choice-response pattern—consistent selection of the ‘A’ 
object (regardless of its position to right or left). 

(2) ‘B’ choice-response pattern—consistent selection of the ‘B’ 
object (regardless of its position to right or left). 

(3) ‘R’ choice-response pattern—consistent selection of the ob- 
ject on the right (regardless of whether it is the ‘A’ or the 
‘B’ object). 

(4) ‘L’ choice-response pattern—consistent selection of the object 
on the left (regardless of whether it is the ‘A’ or the ‘B’ ob- 
ject). 


For purposes of discussion, the tests may be similarly classified 
as ‘A,’ ‘B,’ ‘R,’ and ‘L’ tests. 

The data of all of the tests were analyzed in terms of each of the 
choice-response patterns, to determine whether, on a given test, the 
monkey was responding ‘as though’ one of the other tests was being 
administered. For example, the choices on an ‘A’ test were examined 
to ascertain how many would have been correct for a ‘B,’ an ‘R,’ or an 
‘L’ test. Each instance in which 10 out of 11 (B and C Series) or 
20 out of 25 (A Series) successive choices followed one of the inap- 
propriate choice-response patterns was tabulated, since such consist- 
ency would occur by chance less often than one in a hundred times. 
The number of such erroneous choice-response patterns is given in 
Table IV, compilations being based on the first 25 trials of the A 
Series of tests and on the first 11 trials of each test of the B and C 
Series. 
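
The chance level claimed for these criteria can be illustrated with a binomial tail probability. The sketch below is not the authors' computation, which the text does not describe. It assumes that each choice independently has a probability of 0.5 of agreeing with a given inappropriate pattern, that a single fixed block of trials is examined, and that one pattern is considered at a time; the function name `tail_probability` is hypothetical.

```python
from math import comb

def tail_probability(n, k, p=0.5):
    """Probability that at least k of n independent choices agree with a
    pattern when each agrees by chance with probability p (assumed 0.5)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p_series_bc = tail_probability(11, 10)  # criterion for the B and C Series
p_series_a = tail_probability(25, 20)   # criterion for the A Series

print(f"10 or more of 11: {p_series_bc:.4f}")
print(f"20 or more of 25: {p_series_a:.4f}")
```

On these assumptions both probabilities fall below one in a hundred, in agreement with the statement above. Scanning every run of successive choices, or testing three inappropriate patterns at once, would raise the effective chance rate somewhat.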


TABLE IV 
NUMBER OF ERRONEOUS CHOICE-RESPONSE PATTERNS 

                              Normal     Prefrontal 
                              Monkeys    Monkeys 
Tests 1-30 (A Series)            1          19 
Tests 31-120 (B Series)         24          51 
Tests 121-240 (C Series)        48          57 


It will be noted in the above table that consistent though inap- 
propriate choice-response patterns occurred in both groups of animals. 
The increased incidence of such patterns for the operated monkeys is 
significant at the one percent confidence level for the A Series of 
tests; the differences in the B and C Series are not statistically signifi- 
cant. 

As stated previously the A Series of tests was conducted by re- 
peating a test for a maximum of four days if an animal failed to 
reach the criterion of 20 correct choices in the 25 trials of a day’s 
testing. Analysis of each block of 25 trials which was administered 
in tests 11-30 of the A Series⁴ shows that only one instance of erroneous 
choice-response patterns occurred among the normal animals, 
whereas 49 instances occurred in the operated group. Table V has 
been prepared to show the number and distribution of the various 
choice-response patterns for the individual operated monkeys. 

Keeping in mind that persistent inappropriate choices, consistent 
with one of the four possible solutions, were made for as many as 20 
out of 25 trials by only one normal monkey on only one occasion, a 
glance at Table V shows the marked contrast between the two groups 
of monkeys. Every operated animal manifested false persistencies 
on a substantial number of occasions. The striking character of the 
difference is further illustrated by the fact that on two occasions 
operated animals persisted on the same inappropriate choice-response 
pattern for four successive days (100 trials), at the end of which time 
the problem was changed. There are three additional instances in 
which a given erroneous choice-response pattern was followed for 
three successive days (75 trials). This failure to relinquish an inap- 
propriate mode of response impressed the authors as one of the most 
significant observations of the present study. In the face of ‘over- 
whelming evidence’ as to its current inadequacy, a previously useful 
choice-response pattern was maintained as though alternatives did 
not exist. 


4 Tests 1-10 were conducted by the re-run method, as pointed out in footnote 3. Since the 
use of this method provides data which are not strictly comparable to those obtained by the non- 
correction method (tests 11-30) only tests 11-30 were used in this phase of the analysis. 

TABLE V 

ERRONEOUS CHOICE-RESPONSE PATTERNS FOR OPERATED ANIMALS IN 
TESTS 11-30 OF SERIES A 

The symbols A, B, R, and L represent the four solutions: A object correct, B object correct, 
Right position correct, and Left position correct. The presence of any of these symbols on the 
row for an animal means that 20 or more of 25 choices followed a pattern which would have 
been correct for that test, but which was incorrect for the test actually being administered at 
the time. The presence of a dash means that 25 trials were conducted (25 choices made) with 
no discoverable, statistically significant false pattern. 


C. Motivation and attention 


The operated monkeys were carefully observed for excessive ran- 
dom activity, circling movements, orientation toward and attention to 
the test situation, and quality of motivation since information about 
these items is necessary for adequate assessment of results. 

Extensive injury of the prefrontal areas of monkeys commonly 
results in over-activity. All of our Ss exhibited some degree of restlessness 
and pacing for at least brief periods. Two of them (numbers 
57 and 22) showed marked overactivity, comparable to that demon- 
strated by Ruch and Shenkin (9) following destruction of a special 
region of the orbital surface. For 4-5 months following operation 
it seemed inconceivable that these two monkeys would ever be suit- 
able experimental animals but they did eventually re-adapt to the 
test situation. Surprisingly enough, though the monkeys always 
remained somewhat overactive under ordinary circumstances, testing 
came to have a quieting influence. 

Instances in which random and circling movements were of such 
intensity that they interfered with testing were recorded twice for 
monkey 54, never for monkey 55, twice for monkey 64, four times for 
monkey 22, and 13 times for monkey 57. No such difficulty was 
ever encountered in any test following the 20th in Series A. 

The test results as already presented contain intrinsic evidence 
that the operated animals were well motivated. In Test Series C 
the daily sessions comprised 60 continuous trials and no unusual 
measures were necessary to complete them. Number 22 twice re- 
fused to complete a test session and number 64 refused once. Both 
refusals to continue came after a long series of error responses in 
Series A tests. Normal monkeys will also refuse to test following 
a long series of unrewarded responses. 

Evidences of attention were recorded by the notation VTE (vicarious 
trial and error) and SC (self correction). VTE was noted when 
there were head or eye responses, or both, in which first one and then 
the other object was scrutinized. SC indicated inhibition after reach- 
ing for and even touching or slightly displacing a test object. VTE 
and SC occurred frequently, as shown in Table VI. 


TABLE VI 
VTE AND SC RECORDED FOR OPERATED MONKEYS IN TESTS 1-30 

Animal    VTE     SC 
  54       66     81 
  55       26    186 
  64      106    204 
  22      132    123 
  57      120    197 


These records are presented as unimpeachable evidence of good 
attention to the stimuli. If further evidence is needed, it resides in 
the fact that VTE and SC occurred more frequently on tests which 
were causing trouble rather than on tests which were solved with 
few or no errors. Finally, the fact that the prefrontal monkeys 
continued to make wrong selections according to a definite though 
inappropriate choice-pattern is in itself evidence that choices were 
not random. 


DISCUSSION 


The data throughout the present study show statistically significant 
differences between the behavior of normal monkeys and monkeys 
with extensive, bilateral lesions of the prefrontal areas.⁵ These 


5 It is notable that consistent differences, without overlapping of scores, occurred in this 
study, which included two operated animals that had previously been found to do as well as some 
normal animals on the delayed reaction problem. 
differences appeared in a test situation so designed that successful 
performance depended on alternating employment of antagonistic⁶ 
choice-response patterns. 

It was found that performance on the first three to five trials of the 
tests did not differentiate the operated from the normal monkeys. 
The differences appeared in the later trials and became more pro- 
nounced as the trials proceeded. 

The patterning of error scores showed that in one series (the A- 
Series) of tests the operated animals tended to persist in choice- 
response patterns which, currently inadequate for the solution of a 
problem, had previously served to yield the food reward. The tendency 
was virtually absent among the normal animals in the same 
test series. Though both groups of Ss showed the tendency in the 
other series (B and C), the differences between the groups were not 
significant. 

The authors offer the following interpretations of these findings. 

First, it is asserted that the differences observed between the 
normal and operated monkeys in this study are not attributable to 
difficulty that the individual tests, taken in and of themselves and 
administered independently, would have offered the operated animals. 
Other studies of this present series have demonstrated that pre- 
frontal animals compare quite favorably with normal animals on 
tests of the type used in this investigation (4). The differences which 
have been found must therefore be explicable in terms of the inti- 
mate juxtaposition of the tests in the series. 

Second, the clear-cut evidence of the persistence of erroneous 
choice-response patterns, as in Series A of the present study, is interpreted 
to mean that it is something carried over as the 
result of work on earlier tests in the series which interferes with performance 
on later tests. This postulated phenomenon may be 
designated as perseverative interference. 

Third, since perseverative interference became manifest in a 
situation designed to emphasize antagonistic choice-response patterns, 
it is deduced that the phenomena of perseverative interference will 
be encountered when successful adaptation requires that a pre-established 
reaction set⁷ be suppressed or modified. That is, per- 


6 The term antagonistic is here used to refer to the fact that outwardly similar situations 
would require, at one time, choice of the object on the left and at another time, choice of a similar 
or even identical object on the right for attainment of the food reward. The only differentiating 
cue was the relation of the food reward to objects or to position on the preceding trials. 

7 Admittedly, the expression ‘reaction set’ is vague in the context used. This is intentional, 
for the nature of the interfering, perseverative factor cannot be clearly defined at present. It is 
some occurring state or activity of the animal which at the very least must have a complex 
underlying neural basis. It is also highly probable that none of the psychological categories 
commonly used would have a close correspondence to the factor under consideration. 
severative interference will occur when a situation is ‘calculated’ to 
arouse pertinent but inadequate and interfering reaction sets. The 
arousal of these relics of earlier experience would then be expected 
to hinder performance more than would be the case if the problem had 
less relation to earlier experience and earlier modes of adjustment. 

The hypothesis of perseverative interference as outlined above 
commends itself for the revaluation of data obtained in earlier human 
and animal studies. It has already been mentioned that considera- 
tion of the ‘reversal’ feature of successive delayed reaction trials was 
partially responsible for the development of the present series of tests. 
The reversal feature of delayed reaction has an obvious relation to 
perseverative interference. Whereas performance of human Ss has 
been painstakingly evaluated in terms of learning ability and memory 
in general, the hypothesis of perseverative interference would direct 
the search for possible impairment in these functions to realms where 
interference between competing reaction sets might be anticipated. 


SUMMARY AND CONCLUSIONS 


1. A series of tests was designed to compare the performance of 
five normal and five prefrontal monkeys on problems involving con- 
tradictory reactions to similar or identical stimuli. 

2. The tests differentiated clearly between the two groups of 
animals. The performance of the prefrontal animals was consistently 
and significantly inferior to that of the normal group. Error 
scores averaged for blocks of 30 or 40 tests for each animal showed 
no overlap between the groups. The best score of each operated animal 
was in each instance poorer than the poorest comparable score of any 
normal animal. Evidence was presented to show that the operated 
animals were well motivated. 

3. Two of the operated animals in this series were previously 
compared with normal monkeys in delayed reaction studies, and were 
found to perform as well as some normal animals. 

4. Statistical manipulation showed that it was possible to assign 
the occurrence of some of the wrong choices to the persistence of 
erroneous choice-response patterns in both groups of animals. There 
were no significant differences between the two groups in test series 
B and C, but the earlier series A showed a virtual absence of the 
tendency in normal animals as contrasted with a strongly manifested 
tendency in the operated group. The erroneous choice-response 
tendencies sometimes persisted for as long as 100 trials on the same test. 

5. The inferior performance of the operated animals was inter- 
preted as resulting largely from perseverative interference, i.e., from 
impairment of the ability to relinquish previously acquired, inter- 
fering reaction patterns. 
6. It was suggested that performance of human and animal Ss 
with injury to the prefrontal areas may profitably be re-evaluated 
in terms of the hypothesis of perseverative interference. 


(Manuscript received March 14, 1947) 
EXPERIMENTALLY ACQUIRED DRIVES 


BY MARK A. MAY¹ 


Yale University 


In his studies of wants, interests, and attitudes Thorndike (7) 
reports a number of experiments which confirm the common belief 
that human motives can be acquired and modified by learning. “The 
results of our experiments support the conclusion that a person can 
be taught new attitudes and tastes as surely, though not as easily, as 
he can be taught facts or skills. The basic principles of learning by 
repetition and reward seem to operate with wants, interests, and attitudes 
as they do with ideas and movements” (7, p. 189). Thorndike 
regards these acquired forms of motivation as responses to situations. 
They are unlike other responses, however, in that they exert 
a determining influence on subsequent behavior: “Interests and 
motives cause certain connections to occur, and also cause certain 
after-effects of connections to be satisfying” (7, p. 47). In brief, 
they are habits with motivating and reinforcing power. 

The question arises as to what it is that endows them with these 
powers. Miller and Dollard (5) have advanced the hypothesis that 
the motivating power of a response (or pattern of responses) is a 
function of the strength of the stimulation that it produces. Drive 
is defined as ‘intensity of stimulation.’ A weak stimulus with a low 
drive potential may evoke a response which produces stronger stimulation 
with higher drive potentials. Such drives are called ‘secondary’ 
(i.e., they are mediated by responses). If the connection between 
the weak stimulus and the response is learned, the resulting 
drive is said to be acquired. 

Little is known about the nature of stimulus-producing power 
of different types of responses. Guthrie (2) has suggested that con- 
flicting responses have high motivating values. He regards desires 
as arising from inner conflicts. Visceral disturbances that accompany 
strong emotions are a class of responses which undoubtedly have high 
drive potentials. Tensing of the skeletal muscles also gives rise to 
strong proprioceptive stimulation. In addition to the muscular and 
glandular responses which produce intense stimulation there is an- 
other class of reactions that should be included in order to account 
fully for the motivating responses of interests, wants, attitudes, and, 


1 The laboratory work on the experiments reported in this paper was done by Clayton Bishop, 
Nancy Hirosé, and Natalie Zinn. 
to some extent, fears and anxieties. It is the neural reactions which 
precede the discharges into terminal motor pathways that evoke 
muscular reactions.² 

Any of these responses can presumably become conditioned to any 
stimulus to which an organism is sensitive. There is ample evidence 
that such conditioning does occur, particularly in the case of visceral 
and skeletal responses. The fact, however, that many of these drive-producing 
responses are internal and unobservable raises difficult 
technical problems of measurement. There are, of course, many 
techniques for measuring visceral changes, either directly or indirectly, 
but there are no satisfactory measures of mass neural reactions. 
Until better direct techniques are discovered, indirect methods can 
be used. 


Miller (4, p. 90) has devised a procedure for measuring indirectly the 
drive value of certain patterns of response-produced stimuli. It is based on 
the principle that drive reduction is a reinforcing agent in habit formation. 
Rats were shocked a number of times in a white compartment and permitted 
to escape through an open door into a black compartment. During this 
training the white surroundings presumably became conditioned to a drive- 
producing pattern of responses evoked by shock. The test of such condi- 
tioning is whether the animals will learn a new habit to escape from the white 
compartment. The next step in the experiment was to close the door be- 
tween the two compartments, but attach to it a wheel which, when turned 
by the rat, would release a mechanism opening the door. When the rats 
were put into the white compartment with the door closed, but not shocked, 
they promptly learned to turn the wheel. Miller’s interpretation is that 
the activity which led to the wheel-turning was motivated by drive-pro- 
ducing responses evoked by the white compartment and further that the 
wheel-turning response was reinforced by the termination (or reduction) 
of these response-produced stimuli which occurred when the animal escaped 
into the black surroundings. 


PROCEDURE 


In the experiments reported in this paper still another technique was used for determining 
the acquired drive value of a stimulus. The general plan of the experiments is similar to that 
used in experiments on secondary stimulus-generalization by Lumsdaine (3), Shipley (5), and 
Graham (1). The experimental design may be diagrammed as follows: 


1. Shock ------> RxSx . . . . > R (escape) 
2. Shock ------> RxSx (no escape response permitted) 
   Buzzer . . . . > RxSx 
3. Test: Buzzer . . . . > RxSx . . . . > R (escape)? 

The symbol “RxSx” stands for the pattern of drive-producing responses. The dotted 
arrows represent acquired connections. 


2 There is, of course, a difference of opinion among psychologists as to whether or not it is 
necessary to postulate a class of nonmuscular neural responses. 
First, rats were trained to cross a barrier in a shuttle box and escape shock. The box was 
the Miller-Mowrer demonstration apparatus. It is about 30 in. long, 6 in. wide, and 18 in. high 
with grid rods for the bottom. The barrier is a partition, four in. high, with a roller mounted 
on it to discourage rats from perching. Directly over the roller is a swinging door which can be 
left open or closed. The grid is wired so that each end half can be charged separately. Without 
warning, each S was shocked until he crossed to the cold side of the grid. The normal interval 
between trials was 60 sec. The strength of shock, except for one special experiment, varied 
around 1.0 ma, depending on the weight of the animal and the strength of shock necessary to 
evoke vigorous activity. It was found that shocks of low to moderate intensity gave the best 
results. The main purpose of this training was to condition the stimuli, produced by responses 
to shock, to crossing the barrier. 

The purpose of the second phase of the training was to condition the buzzer to drive-producing 
responses evoked by shock. Each rat, during these buzzer-shock trials, was confined in a pen 
in the middle of the shuttle box. The pen was six in. square and the height of the box. The 
buzzer was sounded for 10 sec., the last five of which were overlaid with shock. The intervals 
between trials varied from five to 120 sec. This variation was introduced to avoid conditioning 
the buzzer to any specific escape response. 

The third phase was the test trials. The Ss were placed in the shuttle box in front of the 
barrier and the buzzer sounded for 10 sec. or until the animal crossed. If he did not cross in 
10 sec., the trial was recorded as a failure. The interval between trials was 60 sec. 

The order of training and testing was as follows: First, each rat received five shock-crossing 
trials in the shuttle box, then five buzzer-shock trials in the shock pen, then five more shock- 
crossing trials. If, on each of the last five shock-crossing trials, he crossed in three sec. or less 
from onset of shock, the training was ended and he was given his test trials immediately. If the 
criterion was not met, his training was continued on the next and succeeding day. All animals 
that did not meet it in three days of training were discarded. This procedure was adopted after 
considerable exploratory work on the amount and distribution of shuttle-box training and buzzer- 
shock training in the shock pen. 
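
The training rule just described can be summarized as a small decision procedure. The following Python sketch is offered only as an illustration of that rule: the function names, the data layout (one list of the last five shock-crossing latencies per training day), and the example latencies are all hypothetical and do not come from the experiment.

```python
CRITERION_SEC = 3       # each of the last five crossings must take 3 sec. or less
MAX_TRAINING_DAYS = 3   # animals not meeting the criterion by then were discarded

def met_criterion(last_five_latencies):
    """True when all five final shock-crossing latencies are within the criterion."""
    return len(last_five_latencies) == 5 and all(
        t <= CRITERION_SEC for t in last_five_latencies
    )

def training_outcome(days):
    """days: one list per training day holding that day's last five latencies (sec.).

    Returns ("test", day) when the criterion is met, ("discard", None) after
    three unsuccessful days, and ("continue", None) otherwise.
    """
    for day, latencies in enumerate(days[:MAX_TRAINING_DAYS], start=1):
        if met_criterion(latencies):
            return ("test", day)
    if len(days) >= MAX_TRAINING_DAYS:
        return ("discard", None)
    return ("continue", None)

# Hypothetical latencies, invented for illustration only.
example = [[4, 2, 2, 3, 2], [2, 3, 1, 2, 3]]
print(training_outcome(example))
```

In this invented example the first day fails because one crossing took four seconds, and the animal would proceed to its test trials at the end of the second day.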

The control groups were trained and tested exactly as the experimental groups with the one 
exception that the buzzer was never paired with shock. Two groups got shock alone in the pen; 
one group got buzzer alone; another, shock or buzzer in a random order. 


RESULTS 
Group A 

All training and testing of this group was done with the door over 
the barrier closed. The experimental group consisted of 12 rats— 
nine albinos, three hooded, all males and from three to six months 
old. Two animals reached the training criterion of crossing to shock 
in one day; eight in two days; two in three days. Each received 25 
successive test trials beginning immediately after the last period of 
training. 

In the control group of 12 rats seven were albinos, five hooded, all 
male and of the same average age as the experimental group. Three 
required only one day of training; six, two days; and three, three 
days. In the shock pen they received shocks only. Each received 
25 test trials just as the experimental group. 


The test scores of each of the animals in the two groups were the 
following: 


Experimentals: 24, 24, 23, 21, 21, 20, 19, 19, 17, 16, 11, 4. 
Total: 219 positive trials out of 300 (73%) 
Controls: 4, 4, 3, 3, 3, 3, 2, 2, 1, 1, 1, 0. 
Total: 27 positive trials out of 300 (9%) 
The mean score of the experimental group was 18.25; of the control, 
2.25. The difference between these means is about 10 times its 
standard error. 
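
The totals, means, and critical ratio for Group A can be recomputed from the individual scores listed above. The sketch below is illustrative and rests on an assumption: the paper does not state how the standard error of the difference was obtained, so the code uses the square root of the summed squared standard errors of the two means, with variances divided by n. The names `summarize` and `critical_ratio` are introduced here.

```python
from math import sqrt
from statistics import mean, pvariance

# Individual test scores (positive trials out of 25) as listed for Group A.
experimentals = [24, 24, 23, 21, 21, 20, 19, 19, 17, 16, 11, 4]
controls = [4, 4, 3, 3, 3, 3, 2, 2, 1, 1, 1, 0]
TRIALS_PER_RAT = 25

def summarize(scores):
    """Return (total positive trials, percent positive, mean score)."""
    total = sum(scores)
    percent = 100 * total / (len(scores) * TRIALS_PER_RAT)
    return total, percent, mean(scores)

def critical_ratio(a, b):
    """Difference between means divided by its standard error.

    Assumption: SE = sqrt(var_a / n_a + var_b / n_b), variances divided by n.
    """
    se = sqrt(pvariance(a) / len(a) + pvariance(b) / len(b))
    return (mean(a) - mean(b)) / se

exp_total, exp_percent, exp_mean = summarize(experimentals)
con_total, con_percent, con_mean = summarize(controls)
ratio = critical_ratio(experimentals, controls)

print(exp_total, round(exp_percent), exp_mean)
print(con_total, round(con_percent), con_mean)
print(round(ratio, 1))
```

Under this assumption the ratio comes out a little under 10; dividing the variances by n - 1 instead gives a slightly smaller value, so "about 10" is a fair description either way.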


Group B 


For this group the door over the barrier was left open. There 
were seven animals in the experimental group, all male albinos. All 
but one reached the training criterion in one day. There were three 
control groups of seven rats each. One group received shock only; 
one, buzzer only; and one, buzzer or shock in random order in the 
shock pen. All but one completed the training in one day. All re- 
ceived 25 test trials immediately following training. The results were 
the following: 


Experimentals: 25, 24, 23, 22, 20, 19, 15. 
Total: 148 (84.5%). Mean: 21.1 
Control (shock only): 13, 12, 12, 7, 6, 6, 4. 
Total: 60 (34.2%). Mean: 8.6 
Control (buzzer or shock): 7, 6, 4, 3, 2, 2, 0. 
Total: 24 (13.7%). Mean: 3.4 
Control (buzzer only): 7, 6, 5, 1, 0, 0, 0. 
Total: 19 (10.8%). Mean: 2.7 


The differences between the means of the experimental group and 
each of the control groups are all significant beyond six times their 
standard errors. 
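
The statement that every comparison exceeds six standard errors can be examined from the scores listed for Group B. The following sketch is illustrative only. It assumes the same standard-error formula as a conventional comparison of two independent means (variances divided by n), which the paper does not specify. The buzzer-only scores are partly illegible in this copy; the list used here is a reconstruction that assumes descending order and the printed total of 19.

```python
from math import sqrt
from statistics import mean, pvariance

experimentals = [25, 24, 23, 22, 20, 19, 15]
control_groups = {
    "shock only": [13, 12, 12, 7, 6, 6, 4],
    "buzzer or shock": [7, 6, 4, 3, 2, 2, 0],
    # Assumption: reconstructed from the printed total (19) and descending order.
    "buzzer only": [7, 6, 5, 1, 0, 0, 0],
}

def critical_ratio(a, b):
    """Difference between means divided by its standard error.

    Assumption: SE = sqrt(var_a / n_a + var_b / n_b), variances divided by n.
    """
    se = sqrt(pvariance(a) / len(a) + pvariance(b) / len(b))
    return (mean(a) - mean(b)) / se

critical_ratios = {
    name: critical_ratio(experimentals, scores)
    for name, scores in control_groups.items()
}

for name, value in critical_ratios.items():
    print(f"experimentals vs. {name}: {value:.1f}")
```

On these assumptions the smallest ratio is that for the shock-only controls, and all three exceed six.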


FURTHER RESULTS 


Additional data obtained from these and other exploratory ex- 
periments support the main hypothesis that the superiority of the 
experimental groups over the control groups may be attributed to 
the greater drive value of the buzzer resulting from pairing it with 
shock. 

1. Response latencies on test trials.—In a recent experiment, to be 
reported later, it was found that the latency of jumping responses to 
shock is a function of the strength of the shock; and, further, that the 
latency of extinction trials is a function of the strength of the shock 
used on the training trials. But extinction trials have somewhat 
longer latencies than their corresponding training trials. This and 
other experimental evidence indicate that response latencies, gen- 
erally, are functions of the drive values of the instigating stimuli. 
If this is true and if it is also true that the buzzer gained increased 
drive value by pairing it with shock, it would be expected that on the 
test trials the latencies of the experimental groups would be significantly 
shorter than those of the controls, but the differences between 
experimentals and controls on the training trials would not be significant. 
The training-trial latencies would, however, be expected to be shorter for all groups 
than those on the corresponding test trials. 

In these experiments all latencies were measured with a stop 
watch, and recorded to the nearest second. While such rough meas- 
ures are subject to errors of fractions of a second, such errors are 
probably distributed in the same way in all groups. Two latency 
scores were used for each rat—one, the mean of his last nine training 
trials; the other, the mean of the first 10 positive test trials. In the 
control groups, where fewer than ten test trials were positive, all 
trials were used. Comparisons of the means of these distributions 
tend to confirm the above predictions. The mean latencies of the 
test trials of each group are significantly longer than those of the 
corresponding training trials. The means of the experimental groups 
for the test trials are shorter than those of the control groups. The 
difference between the means of the experimentals and controls in 
Group A is 2.7 times its standard error; and for B-groups it is 2.3 
times its standard error. On the training trials the mean of the con- 
trols is about .5 sec. shorter than that of the corresponding experi- 
mental group, but the difference is only 1.7 times its standard error. 

2. Extinction of the buzzer.—The greater the drive value of a 
stimulus, the greater are the reinforcing effects of escape from it. If 
the buzzer had greater drive value for the experimentals than for the 
controls, it would be predicted that in the test trials, which are also 
extinction trials, the controls would show more rapid extinction effects 
than the experimentals. This proved true for Group A (door closed) 
but not for Group B (door open). Total test scores of Experimental 
Group A dropped 19 percent from the first five to the last five of the 
25 test trials; the drop for the controls was 83 percent. In the B-Group 
the drop for the experimentals was 25 percent; for the shock-only 
controls, 21.4 percent; and for the other two control groups 
there was a slight increase. Two of the experimental animals and 
two of the shock-only controls were given an additional 75 test trials, 
running the total trials for each up to 100. The controls extinguished 
at about the 35th trial, but neither of the experimentals had extinguished 
at the 100th trial. All of the last five trials for one, and three 
of the five for the other, were positive. 

A special effort was made to extinguish one rat by tiring him out. 
He was given a buzzer test every 15 sec. for 20 trials, and then every 
five sec. for 10 trials. All responses were positive, but the latencies 
of the last 10 trials were longer. Then the buzzer was sounded continuously 
for five min. The intervals between crossing and recrossing 
until extinction occurred were, in sec., 3, 6, 15, 20, 30, 40, 
60, 120. An hour later he was retested and had fully recovered from 
the extinction. 

Five rats, trained according to the procedure of the first experi- 
mental group, were used in a special experiment to see if extinction 
could be produced by first reducing the intensity of the buzzer, then 
gradually increasing it to full strength. Four levels of intensity were 
obtained by muffling the buzzer with 0, 1, 2, or 3 layers of a pocket 
handkerchief. When wrapped in three layers, the buzzer was still 
audible to the experimenter at a distance of 10 feet. The rats were 
never more than two feet away. Each rat was first given 10 test 
trials with the full buzzer. Seventy-five percent of the responses 
were positive. Then a series of tests of five trials each was run beginning 
with the lowest intensity of the buzzer and working up to full 
buzzer. Complete extinction was obtained for all five of the rats. 
This extinction spread, as one would expect, to all other sound stimuli 
to which the animals would respond previous to extinction. Two of 
the six rats were tested for spontaneous recovery after two hours. 
Both had fully recovered. 

3. Extinction of spontaneous crossing.—In the above experiments 
the swinging door over the barrier was never locked. The rats could 
cross and recross at will. An indefinite number of crossings was 
possible during the training and testing trials. Sixty to 90 sec. were 
allowed between the last spontaneous crossing and the next training 
or testing trial. Thus, on the training trials an animal could avoid 
shock for a considerable time by crossing and recrossing at intervals 
of less than 60 sec. Ten of the 12 experimental animals (Group A, 
door closed) and 11 of the controls learned this form of adaptive be- 
havior and used it fairly regularly both on the training and on the 
testing trials. Very few of the animals in the B-Group (door open), 
on the other hand, learned it.³ 

Considering only the results obtained from the A-Group the data 
show first, that on the test trials the behavior extinguishes much more 
rapidly for the controls than for the experimentals. This would seem 
to indicate that the buzzer was more annoying to the experimentals 
than to the controls and that they used the same adaptive behavior 
to avoid it as had been used to avoid shock. Second, for 
the experimentals there is a high correlation (.95 by the rank-difference 
method) between total test score and total number of spontaneous 
crossings during the test trials. The 
corresponding correlation in the case of the controls is .42. 
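
The rank-difference method mentioned above is the procedure now usually called Spearman's rank correlation. The following is a minimal sketch of that computation, not a reproduction of the original analysis: the individual animals' scores are not given in the text, so the two lists below are invented for illustration, and the function names are assumptions of this sketch.

```python
def ranks(values):
    """Return ranks (1 = smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def rank_difference_rho(x, y):
    """Rank-difference coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1)).

    The formula is exact only when there are no tied ranks.
    """
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical data for six animals (NOT taken from the experiment).
test_scores = [22, 18, 25, 9, 14, 20]          # total positive test trials
spontaneous_crossings = [26, 21, 41, 8, 12, 30]  # crossings during test trials

rho = rank_difference_rho(test_scores, spontaneous_crossings)
print(round(rho, 3))
```

With real data, each list would hold one entry per animal in the same order, and a coefficient near 1 would indicate that animals with high test scores also made many spontaneous crossings.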

4. Primary stimulus-generalization.—If, according to the Miller-Dollard 
hypothesis, the buzzer acquired increased drive power for the 


? Why this form of adaptive behavior should be learned more readily when the door over the 
barrier was left closed than when open presents an interesting theoretical problem. 

experimental groups by becoming conditioned to drive-producing responses, 
it would be expected from the principle of primary stimulus-generalization 
that this conditioning would spread to other sound 
stimuli but not to visual or tactile stimuli. Such spread would not 
be expected to occur in the control groups. Therefore, if all groups 
were tested with sound stimuli other than the buzzer and with tactile 
stimuli, the experimental group should show higher scores than the 
control in the case of the sound stimuli but not in the case of the tactile 
stimuli. 

A brief experiment was run to test these deductions with three 
experimental and six control animals. They were trained and tested 
with the door open. On the test trials three sound stimuli were used 
and two tactile stimuli. These were applied in a random order to 
equalize any generalization that might spread from one to another 
as a result of the test trials. The sound stimuli were: (1) the buzzer, 
(2) sharp taps on the glass front of the shuttle box (which was covered 
with a curtain during all test trials), (3) sharp clicks produced by a 
ratchet attached to a wheel at one end of the box; the tactile stimuli 
were: (1) touching the bottom of a foot of the animal with the blunt 
end of a wire manipulated from beneath the grid, (2) vibrations of 
the grid produced by the attachment of an electric vibrator. The 
percentages of crossing responses of each of these two groups to these 
five stimuli are the following: 

                 Buzzer   Taps   Clicks   Vibrations   Wire 
Experimentals      90      92      63         54        100 
Controls            7       8       8         80         83 

The differences in responses to the sound stimuli are statistically 
significant beyond three times their standard errors and confirm the 
prediction. The difference in response to the vibrator is in favor of 
the control group, and in response to the wire is not statistically sig- 
nificant. These two tactile stimuli are undoubtedly generalizations 
from shock. 
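
The criterion of "three times their standard errors" can be illustrated with the usual critical ratio for a difference between two percentages. What follows is only a sketch under stated assumptions: the number of test trials behind each percentage is not reported in this passage, so the trial counts below (30 and 60) are hypothetical, and the function name is an invention of this example. The result therefore shows the form of the calculation, not the original figures.

```python
import math


def critical_ratio(p1, n1, p2, n2):
    """Difference between two proportions divided by its standard error.

    Each proportion's standard error is taken as sqrt(p * (1 - p) / n),
    and the standard error of the difference as the root of the summed
    squares of the two.
    """
    se1_sq = p1 * (1 - p1) / n1
    se2_sq = p2 * (1 - p2) / n2
    return (p1 - p2) / math.sqrt(se1_sq + se2_sq)


# Percentages for the three sound stimuli, as given in the table above.
sound_stimuli = {
    "buzzer": (0.90, 0.07),
    "taps": (0.92, 0.08),
    "clicks": (0.63, 0.08),
}

# HYPOTHETICAL trial counts; the source does not state them here.
N_EXPERIMENTAL = 30
N_CONTROL = 60

ratios = {
    name: critical_ratio(p_exp, N_EXPERIMENTAL, p_ctl, N_CONTROL)
    for name, (p_exp, p_ctl) in sound_stimuli.items()
}
for name, cr in ratios.items():
    print(name, round(cr, 1), "exceeds 3" if cr > 3 else "does not exceed 3")
```

A ratio above 3 corresponds to the criterion used in the text; with other trial counts the ratios would change, which is why the counts are flagged as assumptions.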

5. Observational records.—The drive strength of a stimulus can be 
inferred in a rough qualitative way by observing the behavior of an 
animal in response to it. In these experiments observational records 
were kept of the reactions of all animals to the buzzer on the test 
trials, and to shock on the training trials. Toward the end of the 
work we devised a crude rating scale for recording the apparent anxi- 
eties of the Ss. These observational records indicate three marked 
differences between the reactions of the experimentals and the controls 
to the buzzer. They are reported here with the warning that 
they are subject to the usual biases and errors of all such observations: 
(a) On the negative test trials the experimental animals seemed to be 
much more excited and agitated by the buzzer than the controls. 
The most characteristic initial response of the experimentals was 
startle, sudden jerking of the head, followed by running, turning 
around and around, attempting to climb out of the box, biting the 
grid, and often defecation. The behavior of the controls appeared 
to be far less disturbed. (b) The buzzer had more power to break up 
whatever the animal happened to be doing in the case of the experi- 
mentals. In situations like this rats do a lot of face-washing and 
nail-biting. Such behavior was more often broken up by the buzzer 
in the case of the experimentals than in the case of the controls. (c) 
In a few of the experimental animals the buzzer evoked quite distinct 
reflexes—as, for example, ear movements in the case of some hooded 
rats, and a distinct quiver of the lower jaw in the case of some albinos. 
These responses would appear instantly and regularly at the sound 
of the buzzer and would disappear when the buzzer ceased. They 
were not noticed in any of the control animals. 


DISCUSSION 


These results indicate quite clearly that the drive value of the 
buzzer was substantially increased by pairing it with shock. Ac- 
cording to the Miller-Dollard hypothesis, this gain is due to the fact 
that the buzzer became conditioned to responses that produced 
stimuli that have greater drive value than the buzzer alone but less 
than that of shock. The question for discussion is: what were these 
drive-producing responses to which the buzzer became conditioned? 

First, we may be quite certain that they are unconditioned re- 
sponses to shock which, for convenience, may be divided into three 
groups. (1) The observable responses of startle, jumping, climbing, 
running, biting the grid, squeaking, crouching, and so on. (2) The 
less observable inner responses of changes in rate of respiration, of 
heart action, digestion, and glandular secretions. (3) The wholly 
unobservable neural responses, particularly in the motor centers, 
which can, theoretically at least, become conditioned to external 
stimuli. Any one of these three types of responses could perhaps 
produce stimuli strong enough to account for our results. It is rea- 
sonable to conclude, however, that all of them may, in varying de- 
grees, become conditioned to the buzzer. 

Some light on which ones were most influential in evoking the 
response of crossing on the test trials may be had from the experi- 
ments. The assumption is that on the training trials the stimuli 
produced by some or all of the unconditioned responses to shock 
became conditioned to the crossing responses. But is there any indication 
as to which of these types of unconditioned responses to 
shock produced the stimuli that acquired the greatest habit loadings 
with the crossing responses? The evidence is meagre and inconclusive, 
but as far as it goes it suggests that the most 
influential group of mediating responses is neural and the least influential 
are the skeletal muscular responses that produce proprioceptive 
stimuli. 

Consider first the evidence that proprioceptive cues played a 
minor role as mediators between the buzzer and crossing responses. 
Observational records of the behavior of the Ss in our experiments 
fail to show that any animal made any single consistent response, 
prior to crossing on the training trials, that was paralleled by the pre- 
crossing behavior evoked by the buzzer on the test trials. If, for 
example, an animal had regularly crouched before leaping the barrier 
on the training trials and if on the test trials the initial response to 
buzzer had been crouching, we then might have concluded that the 
proprioceptive cues produced by crouching were the stimuli that 
evoked crossing. But this was not the case. On the contrary, 
when the buzzer evoked behavior that appeared similar to that evoked 
by shock on the buzzer-shock trials, the animals failed to cross as often 
as when it evoked dissimilar behavior. The observational records 
show that the pre-crossing behavior on the positive test trials was of 
three general types: (1) some rats would sit quietly at the end of the 
box and respond to the buzzer rather deliberately by running to the 
barrier and crossing; (2) others would approach the barrier, hesitate, 
back away, then approach again, and this might be repeated two or 
three times before crossing; (3) still others would sit poised near the 
barrier and leap over at the sound of the buzzer. It is possible, of 
course, that both shock and buzzer evoked a general tensing of the 
skeletal muscles unobservable to the experimenter, but even so the 
stimuli evoked by such general responses would be unlikely to have 
sufficient distinctiveness or cue value to permit conditioning to cross- 
ing in as few as 10 trials. 

An experimental check on this point was made by training three 
groups of animals to respond to the buzzer with actions similar to 
those that had been observed to precede crossing on the shock-crossing 
trials. For example, as indicated above, some animals 
crossed by leaping the barrier. To simulate this behavior a group of 
seven rats was trained to jump in the air at the sound of the buzzer. 
This training was substituted for the buzzer-shock training that was 
given to the experimental group in Group B. These animals were 
given 25 test trials each. Their scores were: 8, 7, 7, 5, 4, 4, 3—a total 
of 38 or 21.4 percent positive, with a mean score of 5.4. This percentage 
score is 60 percent less than that of the comparable experimental 
group and the mean score is 15.7 crossings less than the 
experimental group. Another group of six rats was trained and 
tested as were those in Group A, except that for the buzzer-shock 
trials buzzer-shock-wheel-turning was substituted. The wheel was 
mounted at one end of the box about the same distance from the floor 
as the roller on the barrier. The responses involved in wheel-turning 
appear to be similar to the initial ones of climbing over the roller 
on the barrier. The total score of these rats on 76 test trials was 46, 
or 60 percent, which is 10 percent less than that of the comparable 
group of experimental animals. Finally, and in order to simulate the 
behavior of running and crossing, and to see if such supplementary 
training would improve the scores of the control groups in Group B, 
two of these groups were given 20-30 trials each of running from 
one end of the box to the other at the sound of the buzzer. If the rat 
did not run after five seconds of buzzer, he was shocked until he did 
run. The barrier was removed on these trials. Immediately fol- 
lowing this supplementary training each animal was retested on cross- 
ing the barrier. The ‘buzzer-only’ group improved its percentage 
score from 10.8 to 30. The ‘random-buzzer or shock’ group improved 
its percentage score from 13.7 to 52. These improved scores 
are still significantly less than the score of the comparable experi- 
mental group, which is 84.5 percent. These experiments indicate 
that even though training to respond to the buzzer with behavior 
that is similar to that of crossing the barrier is given, such training 
does not produce scores that are superior to those of the two experi- 
mental groups. 
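
The arithmetic for the jumping and wheel-turning groups can be rechecked from the figures given above. This is a minimal sketch: the seven individual scores, the 25 test trials per rat, and the totals of 46 and 76 come from the text, while the assumption that the comparable experimental group (84.5 percent) was also scored on 25 trials per animal is ours. Recomputed this way, the jumping group's percentage comes to about 21.7 rather than the printed 21.4, a small discrepancy that may reflect rounding or a slip in the original; the mean and the difference of 15.7 crossings agree with the text.

```python
# Scores of the seven rats trained to jump at the buzzer (from the text).
jump_scores = [8, 7, 7, 5, 4, 4, 3]
TRIALS_PER_RAT = 25

total_positive = sum(jump_scores)
total_trials = len(jump_scores) * TRIALS_PER_RAT
percent_positive = 100 * total_positive / total_trials
mean_score = total_positive / len(jump_scores)

# ASSUMPTION: the comparable experimental group's 84.5 percent is
# expressed on the same 25 trials per animal.
experimental_mean = 0.845 * TRIALS_PER_RAT
mean_difference = experimental_mean - mean_score

# Wheel-turning group: 46 positive responses on 76 test trials (from the text).
wheel_percent = 100 * 46 / 76

print(total_positive, round(percent_positive, 1), round(mean_score, 1))
print(round(mean_difference, 1), round(wheel_percent, 1))
```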

In regard to the role of visceral responses as producers of medi- 
ating cues, there is no indication from these experiments as to the 
part that they may have played. They may be responsible, in some 
part, for the sensitization effects which persist for several minutes, 
perhaps an hour or more after an animal has been shocked. These 
sensitization effects had a marked influence on the test responses of 
both the experimental and the control groups. In one experiment, 
not reported above, it was found that if the test trials were repeated 
after 24 hours, the positive responses to buzzer were reduced by 64 
percent for an experimental group; 77 percent for a shock-only con- 
trol group; and 80 percent for a buzzer-only control group. Further- 
more, it was found that when a rat was tested from four to 24 hours 
after his last shock and failed to respond to the buzzer, he would re- 
sume responding if given one or two shock-crossing trials, or if he 
could be induced to cross by prodding the bottom of a foot with the 
blunt end of a wire, or by vibrating the grid. This sensitization effect 
could be produced by the release in the bloodstream of chemicals in 
response to shock, or stimuli that resemble shock, or by the reverberation 
of neural circuits instigated by shock, or both. 

The main reason for believing that the most important mediating 
mechanism operating in these experiments is some pattern of neural 
activity in the motor areas of the central nervous system is that the 
one stimulus which, for all animals, was the most distinct and power- 
ful and most clearly and promptly conditioned to crossing was that 
of shock itself. Shock not only had high cue value in a situation but 
also had high drive value. Just why a strong stimulus should have 
a higher drive value than a weak one is unknown. Our guess is, 
however, that strong stimuli evoke more excitation (i.e., a greater 
volume of firing in the central nervous system) than weak ones. 
This firing is not wholly random, but probably patterned. In these 
experiments the pattern evoked by shock produced the escape move- 
ments involved in crossing the barrier. 

Such patterns may be regarded as motor phenomena and can be- 
come conditioned to stimuli. When a buzzer is paired with shock 
under conditions used in the above experiments, it could acquire 
power to produce a pattern similar to that produced by shock in two 
ways. First, it could become conditioned to the pattern directly; 
second, it could become conditioned to the muscular and glandular 
responses to shock, and the proprioceptive and interoceptive stimuli 
produced by these responses may become conditioned to it. Hence, 
on the test trials, when buzzer is sounded without shock, the pattern 
of neural firing produced is sufficiently similar to that evoked by shock 
to elicit the behavior of crossing the barrier. The volume of firing 
in this pattern, however, is presumably less than that evoked by 
shock, as indicated (1) by the longer latencies of the crossing response 
to buzzer than to shock, (2) by the fact that test scores are much 
higher when the animal is sensitized by recent shock. This sensitization 
is believed to be, in large part at least, the persistence of the 
neural pattern after shock has ceased. Furthermore, any stimulus, 
like foot tickle or vibration of the grid which elicits a neural pattern 
similar to that produced by shock, will evoke the crossing response. 
An air blast will not evoke the response because it produces a different 
pattern of neural activity. Similarly, when the buzzer has become 
conditioned to the shock-pattern, any other sound stimulus that 
stands on a gradient of primary stimulus-generalization with the 
buzzer will also tend to produce the response, but less frequently and 
with greater latencies than the buzzer. 


SUMMARY 


The purpose of the experiments was to find out if the drive value 
of a buzzer could be increased by pairing it with shock. The experiments 
were exploratory in the sense that they were not designed to 
determine a law or a function between strength of acquired drive and 
strength and number of shocks, but simply to probe further into the 
problem of secondary drives and how they are acquired. 

First, an escape response was set up in two groups of rats under 
conditions that prevented, insofar as possible, its conditioning to en- 
vironmental cues. Then, a buzzer was paired with shock under 
conditions that prevented the animals from making any consistent 
escape response. The rats were then tested to see if they would make 
the conditioned escape response to the buzzer without being shocked. 
The test scores of one group were 70 percent positive, and of the other 
84.5 percent. 

The control groups were trained and tested in the same way except 
that in no case was the buzzer ever paired with shock. Instead of 
buzzer-shock trials two groups received shock only; one, buzzer 
only; and one, buzzer or shock in random order. The percentages of 
positive test trials were 9.0 and 34.2 for shock-only, 10.8 for buzzer- 
only, and 13.7 for random buzzer or shock. These figures are all 
significantly less than those of the corresponding experimental groups. 

Further results, such as difference in response latencies on test 
trials, tendencies to extinction, and observational records, indicate 
that the buzzer had more drive power for the experimental than for 
the control groups. 

The mediating mechanism between the buzzer and the crossing 
response on the test trials is believed to be mainly that of a pattern 
of neural excitation similar to that produced by shock on the training 
trials. 


(Manuscript received February 22, 1947) 

FAVORABLE VERSUS UNFAVORABLE PROPAGANDA 
IN THE ENJOYMENT OF MUSIC 


BY MELVIN G. RIGG 
Oklahoma A. and M. College 


The data reported in the present article were secured just prior 
to the writer’s entry into the Army, and publication was necessarily 
delayed until his release from active duty. 

To what extent are aesthetic experiences influenced by extraneous 
political and social considerations? During the first World War, 
German opera was banished from the American stage and German 
songs in artists’ recitals were suspect. In the more recent conflict 
these prejudices were much attenuated; however, the writer decided, 
during the early stages of the War, that it would be interesting to 
ascertain to what extent the appreciation of music could be changed 
by different kinds of propaganda. The term propaganda means, in 
the present study, any attempt to influence judgment, whether it is 
true or false, justified or unjustified. The unfavorable propaganda 
selected for this study emphasized the association of some of the 
selections with Hitler and German Nationalism. 


PROCEDURE 


The data were secured from college students, most of them enrolled in psychology courses. 
The same music was used throughout, and consisted of six phonograph recordings, as follows: 


1. Franck: D Minor Symphony, Second Movement, first side, Victor 8961 B. 
2. Wagner: Rhine Journey from Götterdämmerung, Conclusion, Victor 14008 B. 
3. Wagner: Die Meistersinger, Overture, Part II, second side, Columbia 68854 D. 
4. Wagner: Tristan and Isolde, Prelude, Part II, Columbia 67487 D. 
5. Beethoven: Eighth Symphony, Part IV, Third Movement, Minuet, Columbia 68904 D. 
6. Sibelius: Finlandia, second half, Victor 7412 B. 


The students indicated their enjoyment by making a check on a line 134 mm. long in a 
graphic rating scale, and the evaluation was expressed in terms of mm. corresponding to the 
distance from the left end of the line to the check. A portion of the rating sheet appeared as 
follows: 


ENJOYMENT OF MUSIC 


Explanation: For each selection place an X on the line to indicate how well you like the music. 
You may place this mark at any point along the line. The farther over to the right it is, the 
greater is the enjoyment indicated. For example, if two pieces were marked as follows: 


I dislike        I           I am           I         I like 
 it very      dislike     indifferent      like       it very 
  much           it          to it          it         much 

                                               X 
Sample A .......................................................... 
                                                       X 
Sample B .......................................................... 

The indication would be that both were liked, but that Sample B was preferred to Sample A. 
Please begin by marking Selection No. 1 below. 


Students will vary widely in their appreciation of music, and it was necessary to ascertain 
for each person how well the pieces were liked before the effect of the propaganda could be meas- 
ured. The selections were consequently first presented without comment or information of any 
kind, and each student indicated his degree of enjoyment. On a subsequent day the records 
were again played, but the procedure was varied for the three different groups into which the 
students were divided. Group I heard favorable comments about all six selections. Group II 
was the Control Group, and heard no comment, not even the names of the pieces. Group III 
heard for selections 1 and 6 the same comment which had been given to Group I, but for the 
other records, selections 2, 3, 4, 5, the comment was unfavorable, and it is with these four selec- 
tions that the experiment is concerned. On this second occasion the student recorded his opinion 
upon a different sheet from the one used the first time; thus, he made the second judgment 
without having his first decision before him. 


The comments for the four experimental items are presented: 


(Favorable) “Selection No. 2 is from the opera The Twilight of the Gods. In a previous 
opera, Siegfried has won Brünnhilde by braving a circle of fire in the middle of which she was 
sleeping. The present selection is known as Siegfried’s Rhine Journey. After bidding his bride 
a tender farewell and presenting her with the famous magical ring, he starts off on the journey. 
In this music we can, perhaps, catch the spirit of the fearless hero as he rides forth to his destiny, 
and the very intensity of the music brings to us a feeling of the vitality of this young world of 
gods and heroes.” 

(Unfavorable) “Selection No. 2 is Siegfried’s Rhine Journey by Richard Wagner. In a 
series of four operas Wagner glorified the pagan gods and warriors of the ancient Germans. It is 
an interesting fact that Adolph Hitler is passionately fond of Wagner’s music, and many people 
think that Wagner expresses in this music the primitive brutality and the pagan anti-Christian 
tendencies which are a part of the Nazi philosophy. It is reported that before important diplo- 
matic conferences, Hitler frequently orders a performance of a Wagnerian opera at which he is 
the sole spectator, and it is supposed that he makes use of the music in order to work himself up 
to such a frame of mind that he is able to crush his adversaries and bring them to their knees.” 

(Favorable) “Selection No. 3 is from The Mastersingers, one of the world’s greatest operas, 
and one of the few with a happy ending. The portion played is from the overture, and we can 
imagine an audience eagerly awaiting the time for the curtain to rise. The overture contains 
an anticipation of some of the themes to be heard later, skillfully woven together. Perhaps if 
you listen closely, you will catch some phrases of the famous Prize Song which Walter sings in 
the contest, thereby winning the hand of Eva.” 

(Unfavorable) “Selection No. 3 is from Die Meistersinger (which means The Mastersingers), 
another opera by Richard Wagner. This is reputed to be Hitler’s favorite opera, and whenever 
music is desired for a state occasion in Nazi Germany, this is what is performed. Perhaps we 
can imagine the Nazi party chiefs, Goering with his medals, Goebbels with his sharp tongue, and 
the Führer himself, all in a jovial mood. The portion played is a part of the overture.” 

(Favorable) “Selection No. 4 is from Tristan and Isolde, regarded by many as the world’s 
most famous love opera. The portion heard is part of the prelude, and we can imagine that we 
are soon to witness the scene on board the ship that is carrying the Princess Isolde to Cornwall 
to become the unwilling bride of King Mark. The intense yearning of the music, which surges 
endlessly without ever coming to rest, typifies the hopeless and tragic love of Isolde and Tristan, 
which carries them on to their destruction.” 

(Unfavorable) “Selection No. 4 is from Tristan and Isolde, another opera by Richard 
Wagner. This is supposed to be the world’s most passionate love music, depicting the yearning 
of Tristan for Isolde, the wife of another man. Richard Wagner should have known how to 
write such music, for he experienced a similar episode in his own life in his love for the wife of a 
fellow musician, Von Bülow, an affair which resulted in a divorce and in her subsequent marriage 
with Wagner. The portion played is part of the prelude.” 

(Favorable) “Selection No. 5 is from the Eighth Symphony of Beethoven. The portion 
played is the third movement and is a minuet. There is a wholesome, matter of fact quality in 
Beethoven’s music. It is never chaotic, never extravagant, and Beethoven is, perhaps, the most 
generally appreciated of all composers.” 

(Unfavorable) “Selection No. 5 is from the Eighth Symphony of Beethoven, another German 
composer. Beethoven belongs to the Germany of one hundred years ago, before the rise of 
German Imperialism. The portion played is the third movement and is a minuet.” 
This last comment was inserted to ascertain whether the mere mention that Beethoven was 
a German would, after the previous statements, have an adverse effect. 


RESULTS AND DISCUSSION 


Since all four experimental selections showed the same tendency, 
the general results appear to best advantage if the four scores are 
combined into one, as in Table I. 


TABLE I 

COMBINED SCORES FOR SELECTIONS 2, 3, 4, AND 5 

                                   Group Averages 
Group     N    Comment        1st Hearing   2nd Hearing     Gain 
I         62   Favorable        355.92        384.44        28.52 
II        63   None             300.81        314.51        13.70 
III       39   Unfavorable      350.72        354.90         4.18 


The gain made by the Control Group appears to be the result 
merely of hearing the music a second time. The unfavorable com- 
ments offered to Group III practically erased this gain, while the 
favorable propaganda given to Group I had the effect (on the basis 
of the scale that was used) of doubling it. The adverse propaganda 
did not, however, cause an absolute decrease in the appreciation of 
those in Group III; so that it appears that the students did not en- 
tirely forget the music in a maze of extraneous prejudices. 
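
The gains in Table I and the comparisons drawn from them can be rechecked directly. The sketch below uses only the group averages printed in the table; the variable names are ours, and the ratios are offered simply as a restatement of "doubling" and "practically erased," not as an additional statistical test.

```python
# Group averages from Table I: (first hearing, second hearing).
group_averages = {
    "I": (355.92, 384.44),    # favorable comment
    "II": (300.81, 314.51),   # control, no comment
    "III": (350.72, 354.90),  # unfavorable comment
}

# Gain = second hearing minus first hearing, rounded to two decimals.
gains = {
    group: round(second - first, 2)
    for group, (first, second) in group_averages.items()
}

# Each experimental gain expressed relative to the control gain.
favorable_ratio = gains["I"] / gains["II"]
unfavorable_ratio = gains["III"] / gains["II"]

print(gains)
print(round(favorable_ratio, 2), round(unfavorable_ratio, 2))
```

On these figures the favorable group's gain is a little more than twice the control gain, and the unfavorable group's gain is under a third of it, while remaining positive.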

It will be noted that there was a wide variation in the initial ap- 
preciation. Not only do the group averages show large differences, 
but within each group there is also great variability. Since these 
differences in appreciation on first hearing might have an influence on 
the gains that were made, it was decided to remove the effect of the 
‘first hearing’ scores by the analysis of covariance. It was also be- 
lieved that many other variable factors are in effect dependent upon 
this one, since age, sex, and musical training are of concern in the 
present experiment chiefly because of their possible effects upon 


-appreciation.! Since the F-value obtained was significant at better 


than the one percent level, it may be concluded that the three groups, 
when they heard the musical selections for the second time, showed 


¹ The enjoyment of music normally increases with musical training. Miss Geneva Williams, 
while a graduate student at this institution, prepared a report showing that 162 students with 
two years or more of musical training were much higher in their appreciation of music than were 
93 students with no musical training. (Critical ratio, 4.55.) Miss Williams (1) subsequently 
made a further investigation along lines developed by the writer. 

real differences not to be attributed to variations in appreciation at 
the time of the first hearing. The most obvious reason for these 
differences on second hearing is the variation in the propaganda. 


SUMMARY 


College students indicated on a graphic rating scale their en- 
joyment of certain phonographic recordings. At the first hearing 
there was no comment. At the second hearing the music was pre- 
sented to Group I in a romantic light, in Group II (control) there was 
no comment, while for Group III the music was associated with 
Hitler and German Nationalism. The respective mean gains were 
28.52, 13.70, and 4.18. Thus the unfavorable propaganda was almost 
enough to erase the gain which comes from a second hearing without 
comment, while the favorable propaganda produced, on the basis of 
the scale that was used, a gain twice that of the control group. An 
analysis of covariance showed highly significant differences between 
the groups after the effect of the students’ initial appreciation of the 
music had been removed. 


(Manuscript received April 14, 1947) 

THE ABILITY OF RATS TO LEARN THE LOCATION 
OF FOOD WHEN MOTIVATED BY THIRST— 
AN EXPERIMENTAL REPLY TO LEEPER 


BY HOWARD H. KENDLER AND HELEN CHAMBERLAIN MENCHER 


University of Colorado * 


Leeper’s comments (6) on the experiments by Spence and Lippitt 
(7) and by Kendler (3, 4) include a general criticism of these studies, 
as well as a reformulation of his perceptual theory of learning as ap- 
plied to the type of experimental situations used in the Iowa studies. 
The paper by Spence and Kendler (8) evaluates the general theoretical 
and methodological points raised by Leeper. However, to evaluate 
Leeper’s new formulation one must resort to experimentation. 

The main criticism Leeper levels against the Iowa studies is that 
the animals did not adequately ‘perceive’ the goal object for which 
they were not motivated. Believing his study (5) provided adequate 
perceptual conditions, while the Iowa studies failed to do so, he 
writes, in describing his own experiment that “the situation was such 
as to guarantee that, immediately after making the run, the rat would 
approach the pan containing the undesired goal-material and perceive 
discriminatingly what it contained”’ (italicized by Leeper, 6, p. 105). 

The present investigation was designed to meet these revised 
experimental specifications presented by Leeper. The technique 
utilized was simple. A single unit T-maze with a tray containing 
five glasses in each goal box was the apparatus used. In both goal 
boxes one of the five small glasses contained water. On one side the 
remaining four glasses contained food while in the other goal box the 
remaining four containers were empty. The position of the glass 
containing the water was varied from trial to trial so that the animal 
when thirsty during the training trials would be forced when searching 
for water, to ‘perceive discriminatingly’ what the glasses in each goal 
box contained. Since both spatial responses led to a water reward 
the ‘temporal contiguity’ between the choice point and the significates 
in both goal boxes would be approximately similar. According to 
Leeper a cognition should be formed by the thirsty animals, during 
the training trials, that food was present in one of the goal boxes. 
When the animals would be made hungry, they should utilize their 
knowledge and go to the side containing food. 


* This investigation was aided by a grant from the Council on Research and Creative Work 
of the Graduate School of the University of Colorado. 

EXPERIMENTAL PROCEDURE 


Subjects 

The Ss for this experiment were 36 male albino rats (approximately 90 days of age) from the 
colony maintained by the Department of Psychology of the University of Colorado. None of 
the animals had previous experience in a maze. 


Apparatus 


A ground plan and detailed description of the apparatus were given in a previous article (2). 
It is sufficient to say here that the apparatus was a single-choice T-maze with goal boxes slightly 
less than a square foot in area. The left half of the maze was painted black while the right side 
was painted white. This differentiation existed for the stem as well as for the arms of the T-maze. 
Curtains were hung before each goal box in order to preclude the possibility of the animal’s seeing 
what was present in the goal box prior to its choice. In each goal box was a sheet metal tray 
2.5 in. high, 2.75 in. wide, and 10 in. long which was placed flat against the wall farthest from 
the choice point. On top of the tray five holes, 1.75 in. in diameter, permitted the insertion of 
five jigger glasses. The glasses, as well as the tray, were painted white or black in accordance 
with the color of the goal box. The rims of these glasses were .63 in. above the top of the tray. 
Illumination was provided by a 150 watt reflector-flood bulb placed six feet directly above the
choice point. 


Preliminary Training 


After each animal had been handled from three to five min. daily for approximately a week, 
a two-day preliminary training period in the maze was begun. 

Day 1. Adjustment to the apparatus.—All doors were raised and the animal was placed at 
the beginning of the maze and was permitted to explore freely for 40 min. During this time 
both metal trays with the glass containers were removed. The S’s initial choice was recorded. 

Day 2. Test for position habit.—The animals were given two trials under the same maze 
conditions as the previous day, with the exception that the starting compartment was utilized. 
Vertically sliding doors, controlled by the E behind a screen, prevented retracing. The S’s
choices were recorded and their position preference, as exhibited by the three free choice trials
during the preliminary training, was ascertained. The non-correction method was used throughout
the course of the experiment. The metal trays with the glass containers were also removed
during the second day of preliminary training.


Training Series 


During the training series the animals were motivated by thirst. Water was present in 
only one of the five glass containers in each goal box. The position of the glass containing
water was varied regularly from trial to trial in a sequence of 1-3-5-2-4, with the numbers representing
the position of the glasses from left to right in the black goal box and right to left in the
white goal box. If an S chose one position consistently, water was never placed in that container.
For half of the animals, the remaining four containers on the left side contained food (Purina 
dog checkers) while the extra four containers on the right side were kept empty. This situation 
was reversed for the other animals. Half of the animals found food on their preferred side while 
the other half found food on their non-preferred side. When a glass contained either food or
water it was filled to within .5 in. of the rim, thus requiring the animal to look down into the container.

The thirst motivation was controlled by making water available to the Ss for 30 min. daily 
beginning approximately 22½ hours prior to the daily experimental trials. Satiation for food was
regulated by having a large number of Purina dog checkers constantly in the home cages. In 
order to produce as complete food satiation as possible the animals were given a plate of Purina 
kibble meal one hour prior to their daily experimental session. The animals, although rarely 
eating the checkers at this time, would consume some kibble meal. In no instance did any S 
consume any of the food in the maze during the training series. 

The training series consisted of four trials daily for seven days. The initial run of each day
was a free choice, the second was a forced trial to the side opposite that chosen on trial one, the
third was a free choice, while the fourth was a forced run to the side opposite that chosen on
trial three. Thus, the animals had equal experience with the contents of both goal boxes. Forced
trials were accomplished by lowering one of the doors on either side of the choice point.
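
The alternation of free and forced trials described above can be sketched in a few lines of Python. This is only an illustration: the function names `opposite` and `daily_sides` and the "L"/"R" coding are assumptions introduced here, not part of the original procedure.

```python
def opposite(side):
    """Return the goal box opposite to `side` ("L" or "R")."""
    return "R" if side == "L" else "L"


def daily_sides(free_choice_1, free_choice_3):
    """Sides entered on the four daily trials.

    Trials 1 and 3 are free choices; trials 2 and 4 are forced to the
    side opposite the immediately preceding free choice.
    """
    return [
        free_choice_1,
        opposite(free_choice_1),
        free_choice_3,
        opposite(free_choice_3),
    ]


# Whatever the animal chooses, each goal box is entered twice per day.
for first in "LR":
    for third in "LR":
        sides = daily_sides(first, third)
        print(first, third, sides, sides.count("L"), sides.count("R"))
```

Under this scheme every S enters each goal box twice a day regardless of its preferences, which is the sense in which experience with the two goal boxes was equal.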

The animals were allowed to drink the water for five sec. in either goal box. Food was
placed behind the goal box containing the four empty glasses in order to minimize any possible 
olfactory cues. 

The daily experimental session occurred between 1 P.M. and 5 P.M., with at least 15 min.
between successive trials. 


Test Series

The test series consisted of four daily trials for five successive days, beginning on the day
following the last day of the training series.

The motivation of the animals during the test series was hunger. After the last trial of the 
training series, all food was removed from the home cages and the water bottles were inserted. 
On the second and remaining days of the test series the animals received eight gm. of food after 
an interval of at least 15 min. following the last test trial. The nozzle of the water bottle was
always available in the home cages. One hour prior to the experimental session a dish of water 
was inserted into the cage to induce further drinking. 

Just prior to the test series the 36 Ss were divided into two groups: the constant group and
the switched group. These two groups were equated for several factors. They contained the
same number of animals which had food in the left goal box during the training series. They
were equated on the basis of performance on the free choice trials during the training series. The
constant group had a total of 135 free choices to the side containing the four glasses with food
during the seven day training series, while the switched group had 133 free choices to this side.
On the last day of the training trials the constant group had 17 free choices to the food side while
the switched group had 18 free choices.

For the constant group all the containers on the side which had the four glasses with food
during the training series contained Purina dog checkers during the test series. The position of
food for Group C was constant throughout the training and test series. For the switched group
the glasses on the side which during the training trials had contained four empty glasses now all
contained food. The position of food was switched between the training and test series for
Group S.

The S’s choice was final once it passed the door on either side of the choice point.
If it went to the side which contained food it was permitted to eat for 20 sec. If it made a wrong 
response it was kept in the goal box with the empty glasses for an equivalent period of time. 


RESULTS AND DISCUSSION 
Training Series 


The Ss’ behavior in the goal boxes during the training series is of 
particular importance. The validity of our conclusion depends upon 
whether our animals “. . . would approach the pan containing the
undesired goal-material and perceive discriminatingly what it contained”
(6). The typical behavior exhibited by the animals in the goal box
was to approach the tray, place their front paws on the table top 
and look into one of the glass containers. The rat would frequently 
insert his head into the container as far as possible. This perform- 
ance would be repeated until the S would approach the glass contain- 
ing water at which time it would begin drinking. During the latter 
trials of the training series the rapidity of the rats’ discriminatory
behavior increased.

That the rat was unable to respond directly to the glass containing
water is supported by the fact that during the last six days of the training
series¹ (24 trials) the Ss went directly to the glass containing water
on an average of 3.42 times. On the remaining trials they perceived
the insides of at least one of the other containers. The fact that this
behavior of going directly towards the water container was below
chance expectancy (20 percent) can probably be attributed to the
fact that when an S exhibited a consistent preference to go to one of
the five glasses the E never placed the water in that glass.
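
The comparison with chance expectancy can be made explicit with a short calculation. The sketch below uses only the figures given in the text (3.42 direct approaches in 24 trials, five glasses); the variable names are introduced here for illustration.

```python
direct_approaches = 3.42  # mean number of direct approaches per S (from the text)
trials = 24               # last six days of training, four trials per day
n_glasses = 5             # glasses in each goal box

observed_rate = direct_approaches / trials
chance_rate = 1 / n_glasses  # one glass in five holds water

print(f"observed: {observed_rate:.2%}, chance: {chance_rate:.0%}")
# prints "observed: 14.25%, chance: 20%"
```

The observed rate of about 14 percent falls below the 20 percent expected if the animals approached the glasses at random.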

Another consideration in evaluating the adequacy of the present 
experimental technique is that there should be no appreciable differ- 
ence between the running times on the free as compared with the 
forced trials, as well as between the speed of running towards the 
goal box containing food and the goal box containing the empty 
glasses. Since an adequate perception depends to some extent on 
‘temporal contiguity’ it is necessary that this condition be met. The 
results indicate that on the last day of the training series the Ss had 
a mean running time of 5.24 sec. on their free trials as compared with 
a latency of 4.59 sec. on their forced trials. The mean latency on the
trials in which the animals went to the goal box containing food was 
4.37 sec. as compared with the mean running time of 5.46 sec. to the 
side containing the four empty glasses. None of these differences 
was statistically significant. The median latency for the free choice 
trials was three sec. while for the other groupings the median was 
four sec. Therefore it appears that the assumption of equal ‘tem- 
poral contiguity’ for the different response groups was met. 

During the training series, as well as the test for position habit, 
the animals exhibited a definite preference for the left side of the 
maze (painted black). This factor was equated in the formation 
of the constant and switched groups for the test series.


Test Series 


According to Leeper the animals should have acquired a cognition 
during the training trials that food was constantly present in one of 
the goal boxes. This cognition should have been utilized when the 
S’s motivation was changed to hunger, resulting in appropriate re- 
sponses to the side which contained food during the training trials. 

The performance on the first test trial can be subjected to two
statistical questions. The first question is whether the animals reacted
appropriately to their new hunger drive by choosing the side
which contained food during the training trials. The second question
is whether the Ss exhibited any significant change in behavior between
the last training trials and the first test trial.


1 Adequate records were not taken during the first day of the training series.

In answer to the first question we find that on the first test trial,
when the animals were motivated by hunger, 19 of the 36 Ss
(52.78 percent) chose the side which led to food during the training
series. Since the standard error of 50 percent in this case would be
8.33 percent, we cannot reasonably reject the hypothesis that the
animals’ choices on the first test trial resulted from chance variation.
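
The figures in this paragraph can be checked with a minimal sketch. It assumes that the standard error was obtained from the usual binomial formula, the square root of pq/N with p = .50; the text does not state the formula, but this assumption reproduces the reported 8.33 percent. The variable names are illustrative.

```python
import math

n = 36            # Ss making a choice on the first test trial
chose_food = 19   # Ss choosing the side that held food in training
p_chance = 0.5    # chance expectancy for a two-choice maze

observed = chose_food / n
# Standard error of a proportion under the chance hypothesis (assumed formula).
se_chance = math.sqrt(p_chance * (1 - p_chance) / n)
# Deviation from chance expressed in standard-error units.
z = (observed - p_chance) / se_chance

print(round(100 * observed, 2), round(100 * se_chance, 2), round(z, 2))
```

The observed percentage lies only about one third of a standard error above 50 percent, which is why the chance hypothesis cannot be rejected.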
In answer to the second question we find that on the last day of
the training series the animals on their free choice trials chose the
side leading to the goal box with the four containers of food 48.61
percent of the time. On the first test trial they chose this same side 52.78
percent of the time. This difference of 4.17 percent is only .41 times its standard
error (10.19 percent). Consequently the change in behavior induced
by the new motivational state was not of sufficient degree to reject
the assumption that it might be due to chance variation.
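
A sketch of this second comparison follows. Two points are assumptions rather than statements from the text: the count of 35 food-side choices out of 72 free choices on the last training day is inferred from the group totals given in the procedure (17 + 18 choices, 36 Ss with two free trials each), and the standard error of the difference is taken as the unpooled formula for two proportions, which reproduces the reported 10.19 percent.

```python
import math

# Last training day: 17 + 18 food-side choices out of 36 Ss x 2 free trials (inferred).
train_hits, train_n = 17 + 18, 36 * 2
# First test trial: 19 of 36 Ss chose the former food side.
test_hits, test_n = 19, 36

p_train = train_hits / train_n
p_test = test_hits / test_n
diff = p_test - p_train

# Unpooled standard error of a difference between two proportions (assumed formula).
se_diff = math.sqrt(
    p_train * (1 - p_train) / train_n + p_test * (1 - p_test) / test_n
)
ratio = diff / se_diff

print(round(100 * p_train, 2), round(100 * diff, 2),
      round(100 * se_diff, 2), round(ratio, 2))
```

Under these assumptions the calculation returns the 48.61 percent, 4.17 percent, 10.19 percent, and .41 quoted in the paragraph.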
We find the data of the first test trial to be contrary to the theo- 
retical expectation of Leeper’s assumption. 
Table I indicates the number of correct responses made by the
constant and switched groups during the series of 20 test trials. Because
the food remained in the same goal box during the training and
test trials for the animals of the constant group, it would be expected
that they should exhibit a greater number of correct responses during
the test series when compared to the switched group. However this
prediction is not supported by the data. The difference obtained
would be expected, assuming the null hypothesis, 18.41 percent of
the time in the direction obtained.


TABLE I

THE MEAN NUMBER OF CORRECT RESPONSES DURING THE TEST SERIES
FOR THE CONSTANT AND SWITCHED GROUPS

Group       M        σ       σ_M     Diff./σ_diff
C         16.28     1.70     .41
                                         .90
S         15.67     2.20     .53
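
The entries of Table I and the 18.41 percent figure can be reproduced with the sketch below. Several things here are assumptions, not statements from the text: that each group contained 18 animals, that the standard error of a mean was computed as σ divided by the square root of N − 1, that the ratio of the difference to its standard error was rounded to two places before a normal table was consulted, and all function names. These assumptions were chosen because they return the tabled values.

```python
import math

N = 18  # animals per group (assumed: 36 Ss divided equally)


def se_of_mean(sd, n):
    """Standard error of a mean using the sd / sqrt(n - 1) convention (assumed)."""
    return sd / math.sqrt(n - 1)


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean_c, sd_c = 16.28, 1.70  # constant group (Table I)
mean_s, sd_s = 15.67, 2.20  # switched group (Table I)

se_c = se_of_mean(sd_c, N)
se_s = se_of_mean(sd_s, N)
se_diff = math.sqrt(se_c ** 2 + se_s ** 2)

# Ratio of the difference to its standard error, rounded to two places (assumed).
ratio = round((mean_c - mean_s) / se_diff, 2)
# Probability of a difference this large in the obtained direction (one tail).
one_tailed_p = 1.0 - normal_cdf(ratio)

print(round(se_c, 2), round(se_s, 2), ratio, round(100 * one_tailed_p, 2))
```

With these conventions the standard errors come out as .41 and .53, the ratio as .90, and the one-tailed probability as 18.41 percent, in agreement with the text.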

In such problems as the present one, one must always consider
the possibility that only some of the rats are capable of exhibiting
the behavior pattern which is being investigated. Therefore an
analysis of individual data as well as group data is necessary. The
results of the test trials reveal that only one animal made 20 correct
responses during the test series, and this animal was a member of
the switched group. If we assume that seven successive correct responses²
beginning from the first test trial would be indicative of the


2 The choice of this criterion was based upon the fact that such a run of successes when the 
probability of a success in a single trial is one-half would occur less than one percent of the time. 
Of course, in a learning situation one trial is not independent of the previous trials.

learning of the place of food in the training trials, we discover that
four animals in the constant group behaved in such a manner while
three Ss in the switched group did likewise. This difference is obviously
not significant. Consequently the analysis of the behavior
of the individual Ss provides no more support to the perceptual theory
than do the group results.
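
The criterion of seven successive correct responses (footnote 2) rests on a simple probability, sketched below under the footnote's own simplifying assumption that trials are independent with a one-half chance of success. The function name `run_probability` is introduced here for illustration.

```python
def run_probability(run_length, p_success=0.5):
    """Probability of `run_length` consecutive successes on independent trials."""
    return p_success ** run_length


for run_length in (6, 7):
    print(run_length, run_probability(run_length))
# prints "6 0.015625" and "7 0.0078125"
```

Seven is thus the shortest run whose chance probability falls below one percent; as the footnote notes, successive trials in a learning situation are not in fact independent, so the figure is only a rough guide.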

In evaluating these results it is important to consider whether 
the present experimental design fulfills the requirements of Leeper’s 
revised perceptual theory. It is felt that such requirements have
been satisfied³ and consequently this experiment can be considered
an adequate test of his theory. Obviously the results do not lend
any confirmation to Leeper’s formulation or his explanation of the
differences between his results (5) and those of Hull (1).⁴


SUMMARY 


The present investigation was designed to evaluate experiment- 
ally Leeper’s criticism of the studies by Spence and Lippitt and by 
Kendler. Albino rats, under thirst motivation, were subjected to a 
single-choice T-maze situation in which an inverted tray holding five 
glasses was in each goal box. One glass in each tray contained water 
while the remaining four glasses on one side contained food and the 
extra four glasses in the other goal box remained empty. During 
the training series, which lasted seven days, the Ss were subjected 
to four daily trials (two free-choice and two forced). Since the posi- 
tion of the glass containing water varied in an irregular fashion the 
animals were forced to ‘perceive discriminatingly’ the contents of the 
glass containers in each goal box. Because both spatial responses 
were rewarded the time intervals between the perceptions of the 
choice point and the contents of both goal boxes were approximately 
equal. Therefore, according to Leeper, a cognition of the location of 
the unwanted goal object (food) should have been formed during the 
training series. The results of the test series, during which time the


3 One might object that the present design failed to meet the specifications of the phrase ‘on
a wrong run’ which immediately preceded Leeper’s italicized statement “the situation was such as
to guarantee that, immediately after making the run, the rat would approach the pan containing the
undesired goal-material and perceive discriminatingly what it contained.” In order to insure that
the time interval between the occurrence of the perceptions of the choice point and the contents 
of either goal box would be approximately the same, it was thought necessary to reward both 
responses. Since Leeper did not italicize this phrase, and since nothing is stated in the perceptual 
theory which gives greater weight to the importance of wrong runs as contrasted with correct 
runs in the formation of adequate cognitions, little consideration was given to this possible 
objection. 

4 In an experiment very similar to ours, Dr. G. Robert Grice independently obtained essentially
the same results. His experiment mainly differed in that food was presented in a large
quantity rather than in individual cups.

animals were hungry, provided no support to Leeper’s perceptual
theory.


(Manuscript received for immediate publication November 3, 1947)


STUDIES OF FEAR AS AN ACQUIRABLE DRIVE: I.
FEAR AS MOTIVATION AND FEAR-REDUCTION
AS REINFORCEMENT IN THE LEARNING
OF NEW RESPONSES¹


BY NEAL E. MILLER 


Yale University 


An important role in human behavior is played by drives, such
as fears, or desires for money, approval, or status, which appear
to be learned during the socialization of the individual (1, 12, 16, 17,
18). While some studies have indicated that drives can be learned
(2, 8, 15), the systematic experimental investigation of acquired drives
has been scarcely begun. A great deal more work has been done on
the innate, or primary drives such as hunger, thirst, and sex.


Fear is one of the most important of the acquirable drives because it can 
be acquired so readily and can become so strong. The great strength which 
fear can possess has been experimentally demonstrated in studies of conflict 
behavior. In one of these studies (3) it was found that albino rats, trained 
to run down an alley to secure food at a distinctive place and motivated by 
46-hour hunger, would pull with a force of 50 gm. if they were restrained 
near the food. Other animals, that had learned to run away from the end of 
the same alley to escape electric shock, pulled with a force of 200 gm. when 
they were restrained near that place on trials during which they were not 
shocked and presumably were motivated only by fear. Furthermore, ani- 
mals, that were first trained to run to the end of the alley to secure food and 
then given a moderately strong electric shock there, remained well away 
from the end of the alley, demonstrating that the habits motivated by fear 
were prepotent over those motivated by 46-hour hunger (9).² This experimental
evidence is paralleled by many clinical observations which indicate
that fear (or anxiety as it is called when its source is vague or obscured
by repression) plays a leading role in the production of neurotic
behavior (5, 6).


The purpose of the present experiment was to determine whether
or not, once fear is established as a new response to a given situation,


1 This study is part of the research program of the Institute of Human Relations, Yale
University. It was first reported as part of a paper at the 1941 meetings of the A.P.A. The
author is indebted to Fred D. Sheffield for assistance in the exploratory work involved in establishing
the experimental procedure and for criticizing the manuscript.

2 In both of these experiments the 46-hour food deprivation was made more effective by the
fact that the animals had been habituated to a regular feeding schedule and maintained on a diet
that was quantitatively restricted enough to keep them very thin but qualitatively enriched with
brewer’s yeast, cod liver oil, and greens to keep them healthy.

it will exhibit the following functional properties characteristic of
primary drives, such as hunger: (a) when present motivate so-called 
random behavior and (b) when suddenly reduced serve as a rein- 
forcement to produce learning of the immediately preceding response. 


APPARATUS AND PROCEDURE 


The apparatus used in this experiment is illustrated in Fig. 1. It consisted of two compart- 
ments: one white with a grid as a floor and the other black with a smooth solid floor. Both of 
































Fig. 1. Acquired drive apparatus. The left compartment is painted white, the right one
black. A shock may be administered through the grid which is the floor of the white compartment.
When the animal is placed on the grid which is pivoted at the inside end, it moves down
slightly making a contact that starts an electric timer. When the animal performs the correct
response, turning the wheel or pressing the bar as the case may be, he stops the clock and actuates
a solenoid which allows the door, painted with horizontal black and white stripes, to drop. The
E can also cause the door to drop by pressing a button. The dimensions of each compartment
are 18 × 6 × 8½ in.


these had a glass front to enable the experimenter to observe the animal’s behavior. The two 
compartments were separated by a door which was painted with horizontal black and white 
stripes. This door was held up by a catch operated by a solenoid and could be caused to drop 
in any one of three different ways: (a) by the E pushing a button, (b) by the rat moving a little 
cylindrical wheel made of horizontal rods stretched between bakelite disks and exposed above 
the right hand half of the door, (c) by the rat pressing a bar projecting 1½ in. from the side of the apparatus in
front of the upper left hand corner of the door. 

The support of the grid was pivoted at the end near the door and held slightly above a contact
by a little spring at the far end. Placing the rat into the apparatus caused the grid to move
down a fraction of an inch and close the contact. This started an electric clock. When the
animal caused the door to drop by rotating the wheel a fraction of a turn or pressing the bar
(depending upon the way the apparatus was set), he stopped the clock which timed his response.
The wheel was attached to a ratchet in such a way that the part of it facing the rat could only
be moved downward. A brush riding on a segment of the wheel which projected through the
back of the apparatus was arranged in such a way that each quarter of a revolution was recorded
on an electric counter.
The animals used in this experiment were male albino rats approximately six months old.
They had been tamed by handling but had not been used in any other experiment. They were
allowed plenty of food and water in their home cages at all times.
The procedure involved the following five steps:


1. Test for initial response to apparatus.—The animals were placed in the apparatus for 
approximately one min. with the door between the two compartments open and their behavior 
was observed. 

2. Trials with primary drive of pain produced by electric shock.—The procedure for adminis- 
tering shock was designed to attach the response of fear to as many as possible of the cues in the 
white compartment instead of merely to the relatively transient stimulus trace of just having 
been dropped in. This was done so that the animal would remain frightened when he was 
restrained in the compartment on subsequent non-shock trials. The strength of shock used was 
500 volts of 60 cycle AC through a series resistance of 250,000 ohms. The animals were given
10 trials with shock. On the first trial they were allowed to remain in the white compartment
for 60 sec. without shock and then given a momentary shock every five sec. for 60 sec. At the
end of this period of time the E dropped the door and put a continuous shock on the grid.

As soon as the animal had run into the black compartment, the door was closed behind him
and he was allowed to remain there for 30 sec. Then he was taken out and placed in a cage of
wire mesh approximately nine in. in diameter and seven in. high for the time between trials.
Since the animals were run in rotation in groups of three, the time between trials was that required
to run the other two animals, but was never allowed to fall below 60 sec. This procedure was
followed on all subsequent trials.

On the second trial the animal was placed into the center of the white compartment facing
away from the door, was kept there for 30 sec. without shock, at the end of which time the shock
was turned on and the door opened. On trials 3 through 10 the grid was electrified before the
animal was dropped on it and the door was opened before he reached it. On odd numbered trials
the animal was dropped at the end of the compartment away from the door and facing it; on even
numbered trials he was dropped in the center of the compartment facing away from the door.

3. Non-shock trials with experimenter dropping door.—The purpose of these trials was to
determine whether or not the animals would continue to perform the original habit in the absence
of the primary drive of pain from electric shock, and to reduce their tendency to crouch in the
white compartment and to draw back in response to the sound and movement of the door dropping
in front of them.³ Each animal was given five of these non-shock trials during which the





3 During the training in the next step (learning to rotate the wheel), crouching would inter- 
fere with the type of responses necessary in order to hit the wheel and withdrawing would prevent 
the animals from going into the black compartment and having their fear reduced immediately 
after hitting the wheel. Apparently crouching occupies a dominant position in the innate 
hierarchy of responses to fear. Similarly withdrawing seems to be either an innate or a previously 
learned response to the pattern of fear plus a sudden stimulus in front of the animal. During 
the shock trials the response of fear is learned to the pattern of shock plus white compartment 
and the responses of running are learned to the pattern of shock plus stimuli produced by the 
fear response plus the cues in the white compartment. When the shock stimulus drops out of 
the pattern, the generalized fear and running responses elicited by the remainder of the pattern 
are weaker. The innate crouching response to fear is then in conflict with the generalized running 
responses to the pattern of fear plus cues in the alley. If the door is closed, the extinction of 
running and other related responses may reduce their strength to the point where crouching 
becomes dominant. If the door is dropped in front of the animal so that he can immediately 
run out of the white compartment, the reduction in the strength of fear will be expected to 
strengthen the relative dominance of running and related responses to the stimulus of fear plus 
the cues in the white compartment and the sight and sound of the door dropping.

E dropped the door before the animal reached it. As with the preceding trials the animals were
dropped in facing the door on odd numbered trials and facing away from it on even numbered 
ones; they were allowed to remain in the black compartment for 30 sec. and were kept in the 
wire mesh cage for at least 60 sec. between trials. 

4. Non-shock trials with door opened by turning the wheel.—The purpose of these trials was to 
determine whether the continued running without shock was the mere automatic persistence of 
a simple habit, or whether an acquired drive was involved which could be used to motivate the 
learning of a new habit. During these trials the E no longer dropped the door. The apparatus 
was set so that the only way the door could be dropped was by moving the wheel a small fraction 
of a turn. The bar was present but pressing it would not cause the door to drop. The animals
that moved the wheel and caused the door to drop were allowed to remain 30 sec. in the black 
compartment. Those that did not move the wheel within 100 sec. were picked out of the white 
compartment at the end of that time. All animals remained at least 60 sec. between trials in the 
wire mesh cage. All animals were given 16 trials under these conditions. On each trial the 
time to move the wheel enough to drop the door was recorded on an electric clock and read to the 
nearest 10th of a sec.


5. Non-shock trials with door opened by pressing the bar.—The purpose of these trials was to 
determine whether or not animals (a) would unlearn the first new habit of turning the wheel if 
this habit was no longer effective in dropping the door, and (b) would learn a second new habit, 
pressing the bar, if this would cause the door to drop and allow them to remove themselves from 
the cues arousing the fear. Animals that had adopted the habit of crouching in the white com- 
partment till the end of the 100 sec. limit and so had not learned to rotate the wheel were excluded 
from this part of the experiment. These trials were given in exactly the same way as the pre- 
ceding ones except that the apparatus was set so that turning the wheel would not cause the door 
to drop but pressing the bar would. During these trials there was no time limit; the animals were 
allowed to remain in the white compartment until they finally pressed the bar.⁴ The time to
press the bar was recorded on an electric clock to the nearest 10th of a sec. and the number of 
revolutions of the wheel was recorded on an electric counter in quarter revolutions. 


SUGGESTED IMPROVEMENTS IN PROCEDURE 


In the light of further theoretical analysis and experimental results it is 
believed that the above procedure could be improved by the following 
changes: (a) Have the door drop down only part of the way so that it remains 
as a hurdle approximately two in. high over which the animals have to 
climb, thus introducing components of standing up and reaching into the 
initial response. This should favor the subsequent occurrence of wheel 
turning or bar pressing. (b) Connect the door to an electronic relay so 
that it will fall when touched and require the animals to touch it in order 
to make it fall during steps 2 and 3 of the experiment. This should tend 
to accomplish the same purpose as the preceding change and also insure 
that the animals have the response of running through the door attached 
to the stimulus produced by its dropping when they are very close to it. 
(c) Increase the number of non-shock trials in step 3 to approximately 12 in 
order to further counteract crouching. (d) At the end of the time limit in 
step 4, drop the door in front of the animal instead of lifting him out of the 
white compartment. This should tend to maintain the strength of the 
habit of going through the door and make it less likely that crouching or 
sitting will be learned. 


RESULTS 


In the test before the training with electric shock, the animals 
showed no readily discernible avoidance or preference for either of the 


4 One animal which did not hit the bar within 30 min. was finally discarded.

two chambers of the apparatus. They explored freely through both
of them.

During the trials with primary drive of pain produced by electric
shock, all of the animals learned to run rapidly from the white compartment
through the door, which was dropped in front of them by
the E, and into the black compartment. On the five trials without
shock, and with the E still dropping the door, the animals continued
to run. The behavior of the animals was markedly different from
what it had been before the training with the primary drive of pain
from electric shock.

When the procedure of the non-shock trials was changed so that 
the E no longer dropped the door and it could only be opened by
moving the wheel, the animals displayed variable behavior which 
tended to be concentrated in the region of the door. They would 
stand up in front of it, place their paws upon it, sniff around the edges, 
bite the bars of the grid they were standing on, run back and forth, 
etc. They also tended to crouch, urinate, and defecate. In the
course of this behavior some of the animals performed responses, 
such as poking their noses between the bars of the wheel or placing 
their paws upon it, which caused it to move a fraction of a turn and 
actuate a contact that caused the door to open. Most of them then 
ran through into the black compartment almost immediately. A 
few of them drew back with an exaggerated startle response and 
crouched. Some of these eventually learned to go through the door; 
a few seemed to learn to avoid it. Other animals abandoned their
trial-and-error behavior before they happened to strike the wheel and 
persisted in crouching so that they had to be lifted out of the white 
compartment at the end of the 100 sec. period. In general, the ani- 
mals that had to be lifted out seemed to crouch sooner and sooner 
on successive trials. 

Thirteen of the 25 animals moved the wheel enough to drop the 
door on four or more out of their first eight trials. Since, according 
to theory, a response has to occur before it can be reinforced and 
learned, the results of these animals were analyzed separately and 
they were the only ones which were subsequently used in the bar-
pressing phase of the experiment.* The average speed (reciprocal
of time in seconds) with which these animals opened the door by 
moving the wheel on the 16 successive trials is presented in Fig. 2. 
It can be seen that there is a definite tendency for the animals to 


* In a subsequent experiment (13) in which further steps suggested by the theoretical analysis
(see footnote 3 and SUGGESTED IMPROVEMENTS IN PROCEDURE) were taken to get rid of the
crouching, none of the 24 animals in the group which had received the strong shock had to be
eliminated for crouching; all of them learned to perform the new response during the non-shock
trials.

learn to turn the wheel more rapidly on successive trials. Eleven
out of the 13 individual animals turned the wheel sooner on the 16th 
than on the first trial, and the two animals which did not show im- 
provement were ones which happened to turn the wheel fairly soon 
on the first trial and continued this performance throughout. The 
difference between the average speed on the first and 16th trials is of
a magnitude (t = 3.5) which would be expected to occur in the direc-
tion predicted by theory, less than two times in 1000 by chance.
Therefore, it must be concluded that those animals that did turn the

[Figure 2. Abscissa: trials with wheel functioning to open door.]

Fig. 2. Learning the first new habit, turning the wheel, during trials without primary drive.
With mild pain produced by an electric shock as a primary drive, the animals have learned to 
run from the white compartment, through the open door, into the black compartment. Then 
they were given trials without any electric shock during which the door was closed but could be 
opened by turning a little wheel. Under these conditions the 13 out of the 25 animals which 
turned the wheel enough to drop the door on four or more of the first eight trials learned to turn it. 
This figure shows the progressive increase in the average speed with which these 13 animals ran 
up to the wheel and turned it enough to drop the door during the 16 non-shock trials. 


wheel and run out of the white compartment into the black one
definitely learned to perform this new response more rapidly during 
the 16 trials without the primary drive of pain produced by electric 
shock. 
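
As an illustration only, the speed measure plotted in Fig. 2 (the reciprocal of the time in seconds, averaged over the animals on each trial) may be sketched as follows. The latencies are hypothetical values chosen for the example and are not data from the experiment; the function names are likewise assumptions.

```python
from statistics import mean


def speed(latency_sec: float) -> float:
    """Speed score: the reciprocal of the time in seconds taken to respond."""
    if latency_sec <= 0:
        raise ValueError("latency must be positive")
    return 1.0 / latency_sec


def average_speed_by_trial(latencies):
    """Return one mean speed per trial.

    latencies[animal][trial] is the time in seconds for that animal
    on that trial; every animal must have the same number of trials.
    """
    n_trials = len(latencies[0])
    return [
        mean(speed(animal[t]) for animal in latencies)
        for t in range(n_trials)
    ]


# Hypothetical latencies (seconds) for three animals over four trials.
example_latencies = [
    [50.0, 25.0, 10.0, 5.0],
    [40.0, 20.0, 10.0, 4.0],
    [100.0, 50.0, 20.0, 10.0],
]

curve = average_speed_by_trial(example_latencies)
for trial, value in enumerate(curve, start=1):
    print(f"trial {trial}: mean speed = {value:.4f}")
```

One consequence of averaging reciprocals rather than raw times is that a single very long latency has comparatively little influence on the average for that trial.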

When the setting on the apparatus was changed so that the wheel
would not open the door but the bar would, the animals continued
to respond to the wheel vigorously for some time. It was obvious
that they had learned a strong habit of responding to it. Eventually,
however, they stopped reacting to the wheel and began to perform 
other responses. After longer or shorter periods of variable behavior
they finally hit the bar, caused the door to drop, and ran through
rapidly into the black compartment. On the first trial the number
of complete rotations of the wheel ranged from zero to 530 with a
median of 4.75. On successive trials during which turning the wheel
did not cause the door to drop, the amount of activity on it progres-
sively dropped till by the tenth trial the range was from 0 to 0.25
rotations with a median of zero. The progressive decrease in the
amount of activity on the wheel is shown in Fig. 3. It is plotted in
medians because of the skewed nature of the distribution.

[Figure 3. Abscissa: trials with wheel non-functional, bar functional.]

Fig. 3. Unlearning of the habit of turning the wheel during trials on which it no longer
serves to reduce the acquired drive. When conditions were changed so that turning the wheel
was ineffective (and pressing the bar was effective) in causing the door to drop and allowing the
animal to run from the white into the black compartment, the animals showed a progressive
decrement in the response of rotating the wheel. Each point is based on the median scores of
13 animals.


Twelve out of the 13 rats which were used in this part of the experiment gave
fewer rotations of the wheel on the tenth than on the first trial.
From the binomial expansion it may be calculated that for 12 out of
13 cases to come out in the direction predicted by the theory is an
event which would be expected to occur by chance less than one time
in 1000. Thus, it may be concluded that the dropping of the door,
which is presumed to have produced a reduction in the strength of
fear by allowing the animals to escape from the cues in the white
compartment which elicited the fear, was essential to the mainte-
nance of the habit of rotating the wheel.

[Figure 4. Abscissa: trials with wheel non-functional, bar functional.]

Fig. 4. Learning a second new habit, bar pressing, under acquired drive. Conditions were
changed so that only pressing the bar would cause the door to drop and allow the animals to run
from the white compartment where they had been previously shocked, into the black one where 
they had escaped shock. During non-shock trials under these conditions, the animals learned a 
second new habit, pressing the bar. Each point is based on the average speed of 13 animals. 


The results on bar pressing are presented in Fig. 4. It can be seen 
that the speed of bar pressing increased throughout the 10 non-shock 
trials during which that response caused the door to drop. Since the 
last trial was faster than the first for 12 out of the 13 animals, the 
difference was again one which would be expected by chance less than 
one time in 1000. 
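
The binomial calculations referred to above, for 12 out of 13 animals changing in the predicted direction, can be sketched as below. This is a minimal illustration under stated assumptions: a chance probability of 0.5 for each animal, and a sum over the upper tail (12 or more cases out of 13). The text does not say which form of the calculation was used, so the output of this sketch need not reproduce the reported figure exactly. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes: int, n: int, p: float = 0.5) -> float:
    """Probability of observing `successes` or more out of `n` cases
    when each case independently has probability `p` (assumed 0.5 here)."""
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(successes, n + 1)
    )


# 12 or more of 13 animals in the predicted direction, chance level 0.5.
p_12_of_13 = binomial_upper_tail(12, 13)
print(f"P(12 or more of 13) = {p_12_of_13:.6f}")
```

Whether one counts exactly 12 cases, 12 or more, or both tails changes the value obtained, which is why the convention is stated as an assumption here.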


DISCUSSION


On preliminary tests conducted before the training with electric 
shock was begun, the animals showed no noticeable tendency to 
avoid the white compartment. During training with the primary 
drive of pain produced by electric shock in the white compartment, 
the animals learned a strong habit of quickly running out of it, 
through the open door, and into the black compartment. 

On non-shock trials the animals persisted in running from the 
white compartment through the open door into the black one. On 
additional non-shock trials during which the door was not automati- 
cally dropped in front of the animals, they exhibited so-called random 
behavior and learned a new response, turning the wheel, which 
caused the door to drop and allowed them to escape into the black 
compartment. This trial-and-error learning of a new response de-
monstrated that the cues in the white compartment had acquired
the functional properties of a drive and that escape from the white
into the black compartment had acquired the functional properties
of a reward.

At this point the results of two later experiments which serve as
controls should be briefly mentioned. One of these (13) demonstrated
that the capacity of the cues in the two compartments to motivate 
and reinforce new learning was a function of the strength of the 
primary drive involved in the previous stage of the training. Animals 
put through the same procedure in every respect except that the 
primary drive was a weak one produced by a 90 volt electric shock
showed no tendency to learn a new habit (which in this case was bar
pressing) on subsequent non-shock trials. Animals, given their initial
training with a stronger primary drive produced by a 540 volt shock, 
showed rapid learning of the new response on subsequent non-shock 
trials. For these two groups all other features of the experiment 
were exactly the same including possible initial preferences for the 
different features of the two compartments and trials of running in the 
apparatus with the last response to the cues in the white compart- 
ment being going through the door into the black one, etc. There- 
fore, the difference in learning during the non-shock trials must have 
been a function of the previous training, and more specifically a func- 
tion of the strength of the primary drive involved in that training. 

The second experiment which serves as a control demonstrated 
that if the non-shock trials were continued long enough, the new 
habit of pressing the bar and the older response of running through 
the door would both eventually extinguish (11). Thus, in this situ-
ation the primary drive of pain is essential not only to the establish-
ment of the acquired drive, but also to its maintenance.

In the present experiment, when the animals were dropped into the 
white compartment on the non-shock trials following their training 
with shock, they exhibited urination, defecation, tenseness, and other 
forms of behavior which are ordinarily considered to be symptoms of 
fear. Furthermore, the procedure of having been given a number of 
moderately painful shocks in this compartment would be expected 
to produce fear. Therefore, it seems reasonable to conclude that the
acquirable drive motivating the learning of the new response of turn-
ing the wheel was fear and that a reduction in the strength of this
fear was the reinforcing agent. Thus, this experiment confirms
Mowrer’s (14) hypothesis that fear (or anxiety) can play a role in
learning similar to that of a primary drive such as hunger. 

In terms of the hypothesis put forward in Miller and Dollard (12) 
the cues in the white compartment acquire their drive value by
acquiring the capacity to elicit an internal response which produces
a strong stimulus. Whether this strong stimulus is produced by 
peripheral responses, such as those involved in the blanching of the 
stomach and the tendency for hair to stand on end, or by central im- 
pulses which travel from the thalamus to sensory areas of the cortex 
is a matter of anatomical rather than functional significance. Fear 
may be called a stimulus-producing response if it shows the func- 
tional characteristics of such responses, in brief, obeys the laws of 
learning and serves as a cue to elicit learned responses such as the 
verbal report of fear. 

The general pattern of the fear response and its capacity to pro-
duce a strong stimulus is determined by the innate structure of the
animal. The connection between the pain and the fear is also pre-
sumably innate. But the connection between the cues in the white
compartment and the fear was learned. Therefore the fear of the
white compartment may be called an acquired drive. Because fear 
can be learned, it may be called acquirable; because it can motivate 
new learning, it may be called a drive. 

Running through the door and into the black compartment re- 
moved the animal from the cues in the white compartment which 
were eliciting the fear and thus produced a reduction in the strength 
of the fear response and the stimuli which it produced. This re- 
duction in the strength of the intense fear stimuli is presumably what 
gave the black compartment its acquired reinforcing value. 

If the reduction in fear produced by running from the white into
the black was the reinforcement for learning the new habit of wheel
turning, we would expect this habit to show experimental extinction 
when that reinforcement was removed. This is exactly what hap- 
pened. During the first trial on which turning the wheel no longer 
dropped the door, the animals gradually stopped performing this 
response and began to exhibit other responses. As would be ex- 
pected, the one of these responses, pressing the bar, which caused 
the door to drop and allowed the animal to remove himself from the
fear-producing cues in the white compartment, was gradually learned 
in a series of trials during which the wheel turning was progressively 
crowded out. Thus, it can be seen that the escape from the white 
compartment, which presumably produced a reduction in the strength 
of the fear, played a crucial role, similar to that of a primary reward, 
in the learning and maintenance of the new habits. 

Some of the implications of the principles which this experiment 
has demonstrated should be mentioned briefly. It can be seen that 
being able to learn a response (fear of the white compartment)
which in turn is able to motivate the learning and performance of a
whole category of new responses (turning the wheel, pressing the bar,
and any other means of escape from the white compartment) greatly
increases the flexibility of learned behavior as a means of adapting to
a changing environment. 

The present experiment has demonstrated the drive function of 
fear as a response which presumably produces a strong stimulus.
But if fear is a strong response-produced stimulus, it will be expected
to function, not only as a drive, but also as a cue mediating secondary
generalization. Thus, when fear is learned as a new response to a
given situation, all of the habits which have been learned elsewhere in 
response to fear, as well as the innate responses to fear, should tend 
to be transferred to that new situation. Evidence supporting this 
deduction has been secured in a recent experiment by May (7). 

It seems possible that the potentialities of response-produced 
stimuli as mediators of secondary generalization and sources of ac-
quirable drive may account in stimulus-response, law-of-effect terms
for the type of behavior which has been described as ‘expectancy’ and 
considered to be an exception to this type of explanation. If it should 
turn out that all of the phenomena of expectancy can be explained 
on the basis of the drive and cue functions of response-produced 
stimuli, expectancy will of course not vanish; it will be established as 
a secondary principle derivable from more primary ones. 

The mechanism of acquired drives allows behavior to be more
adaptive in complex variable situations. It also allows behavior to 
appear more baffling and apparently lawless to any investigator who 
has not had the opportunity to observe the conditions under which 
the acquired drive was established. In the present experiment the 
learning and performance of the responses of turning the wheel and 
pressing the bar are readily understandable. An E dealing with 
many rats, a few of which without his knowledge had been shocked in 
the white compartment, might be puzzled by the fact that these few 
rats became so preoccupied with turning the wheel or pressing the 
bar. In the present experiment, the white and black compartments 
are very obvious features of the animal’s environment. If more ob- 
scure external cues or internal ones had been involved, the habits 
of turning the wheel and pressing the bar might seem to be completely
bizarre and maladaptive. One hypothesis is that neurotic symptoms,
such as compulsions, are habits which are motivated by fear (or
anxiety as it is called when its source is vague or obscured by re-
pression) and reinforced by a reduction in fear.*


* The author’s views on this matter have been materially strengthened and sharpened by
seeing the way in which Dollard (4), working with symptoms of war neuroses, has independently
come to a similar hypothesis and been able to apply it convincingly to the concrete details of the
case material.

SUMMARY


Albino rats were placed in a simple apparatus consisting of two
compartments separated by a door. One was white with a grid as a 
floor; the other was black without a grid. Before training, the animals 
showed no marked preference for either compartment. Then they 
were placed in the white compartment, received an electric shock
from the grid, and escaped into the black compartment through the 
open door. After a number of such trials, the animals would run 
out of the white compartment even if no shock was on the grid. 

To demonstrate that an acquired drive (fear or anxiety) had been
established, the animals were taught a new habit without further shocks. 
The door (previously always open) was closed. The only way that 
the door could be opened was by rotating a little wheel, which was 
above the door, a fraction of a turn. Under these conditions, the 
animals exhibited trial-and-error behavior and gradually learned to 
escape from the white compartment by rotating the wheel. 

If conditions were changed so that only pressing a bar would 
open the door, wheel turning extinguished, and a second new habit 
(bar pressing) was learned. 

Control experiments demonstrated that the learning of the new 
habits was dependent upon having received moderately strong electric 
shocks during the first stages of training. 

The following hypotheses were discussed: that responses which 
produce strong stimuli are the basis for acquired drives; that such 
responses may be the basis for certain of the phenomena of learning 
which have been labeled ‘expectancy,’ thus reducing this from the
status of a primary to a secondary principle; and that neurotic symp-
toms, such as compulsions, may be motivated by anxiety and rein-
forced by anxiety-reduction like the two new responses learned in
this experiment. 


(Manuscript received for immediate publication December 15, 1947)


DISCUSSION


THE EXPERIMENTS BY SPENCE AND LIPPITT AND 
BY KENDLER ON THE SIGN-GESTALT 
THEORY OF LEARNING 


BY ROBERT W. LEEPER 


University of Oregon 


The three experiments herein considered are interpreted by their authors 
as disproving the idea that learning can be explained as a product of per- 
ceptual processes, and as showing, instead, that the essential factor in 
learning is reinforcement. This discussion, in contrast, will present the
view that these experiments have a real theoretical significance, but of a 
different sort. 

It is of course true that some psychologists do not care much as to 
whether a study has any theoretical significance. They are inclined to 
agree with workers outside the field of psychology, who generally say that 
the trouble with psychology is ‘too much theorizing, too little factual evi- 
dence.’ But on the other hand, most of those who are familiar with the 
technical work of psychology know that psychology has now achieved a 
considerable maturity in factual material and fact-gathering techniques, 
but has failed signally to make corresponding progress in utilizing theoret- 
ically its available knowledge. We know even that, if our factual material 
is deficient in psychology, this often can be traced to our ineptness in our 
theoretical work, because experimentation is like exploring for oil—it rarely 
pays just to sink a hole at random. 

This unequal development of facts and theories is seen in the field of 
learning at least as truly as in most other fields of psychology, despite the 
enormous investment which psychologists have made in this field. Con- 
sequently, all of us can acknowledge our gratitude to Spence and his students 
for their attempts theoretically to define, and experimentally to decide, the 
most important questions in the field of learning. The articles discussed 
herein, therefore, are ones that deserve careful discussion. 

The experiments in question were inspired by a criticism of the law of 
reinforcement (or effect) which came particularly from a pair of experiments 
by Clark Hull (1) and myself (4), and from an interpretation of these ex- 
periments which I offered in terms of Tolman’s interpretation of learning. 
One reason why this pair of experiments gave a challenge which Hull and 
Spence needed to consider is that the acid test of psychological theories is 
their pragmatic value, first, in permitting predictions in situations where 
the person predicting does not yet know the outcome of the situation (and 
hence really has to ‘predict’) and, second, in guiding the construction of
situations suited to secure some desired result. This latter function was
involved in these experiments. Both Hull and myself felt that it could be
demonstrated that rats could learn to follow one path in a maze when hungry
and a second path when thirsty, and we set out (independently) to con- 
struct situations that would demonstrate this phenomenon. Using his S-R 
reinforcement concepts, Hull planned one experimental situation. His rats 
mastered the problem, but 25 eight-day periods of training were required 
before the rats reached 80 percent correct on the first trials of each day. 
Using Tolman’s and Lashley’s concepts as a guide, I planned a different ex- 
perimental situation to accomplish the same result. The rats reached this 
same level of 80 percent accuracy after merely one eight-day period of train- 
ing, and after two such periods reached a 94 percent accuracy on first trials, 
which Hull’s rats never attained. In still another maze, adding some latent- 
learning conditions which, according to an S-R reinforcement theory, should 
not have helped the learning, I had another group reach 100 percent accuracy 
after merely four days of training. Control tests showed that, with Hull’s 
maze, I could get his results. 

To account for these extreme differences, I suggested that the explanation 
could not be found in factors operating on correct runs, since the situation in 
both Hull’s and my mazes on correct runs involved the same series of events
(choice of route, the run, and prompt reinforcement). Neither Hull nor
Spence has raised objection to this suggestion. But, I said, the situation 
in the two studies on incorrect runs was fundamentally different. In Hull’s 
maze, a wrong run brought the rat to a door blocking its entrance into the 
end-box. In consequence, I said, a rat in that maze, on wrong runs, should 
learn something which would hamper its performance on a following day when 
its motivation would be shifted (from hunger to thirst, or vice versa). In my 
mazes, on the other hand, separate end-boxes were used, and a wrong run 
led the rat into an end-box containing the goal-material for which it was not 
motivated on that day, but for which it might be motivated on the succeed- 
ing day. I said that, even though this situation was ‘frustrating,’ it should 
have caused the rat to learn something that would help it to change its behavior 
on a following day when its motivation was changed. If such an interpretation 
is correct, it means, of course, not merely that ‘habits’ have more the nature 
of ‘knowledge of what leads to what’ than the nature of S-R connections, 
but also that perceptual processes, rather than reinforcement, are the crucial 
factor in causing learning. 

To test this interpretation, Spence and Lippitt, and later Kendler, pro- 
ceeded on the same principle that the acid test of a theoretical formulation 
is, in part, its ability to guide the construction of a situation which would 
attain some specified result. Accordingly, they used somewhat new designs 
of maze and somewhat new programs of training, but ones which corre- 
sponded, as far as they could tell, with the specifications of the sign-gestalt 
theory of learning with regard to the essential conditions operating in my 
experiment. Instead of training the rats with alternated hunger and thirst, 
for example, all three experiments used one uniform motivational condition 
during the original training, and then gave later critical tests under a dif-
ferent motivational condition to see whether there had been some ‘latent
learning’ of the location of, or of the means of reaching, some goal-material
for which the rats had not been motivated during the original learning.
Thus, Spence and Lippitt’s rats originally were thirsty, and the later critical
trials were given under hunger (5). In the original trials the second end- 
box had contained food for the experimental group, but had been empty 
with the control group. They found that, with the motivation shifted to 
hunger, none of the rats went to the food location on the first run, and that 
the experimental group learned no more rapidly than the control group to 
run to the now-correct side. In one of his studies (3), Kendler virtually 
repeated this situation of Spence and Lippitt’s experimental group, but 
with the two routes more distinct visually and with hunger rather than thirst 
used with some of the rats in the original training. In the two critical trials 
in which the motivation was shifted from that used originally, none of the
rats on either trial went to the now-correct side. In the other study, re- 
ported in the preceding issue of this journal (2), Kendler’s control group was 
both hungry and thirsty originally and found food in one end-box and water 
in the other. The experimental group had the same external situation, but
was satiated for both goal-materials. In the critical trials, made hungry 
on two trials and thirsty on two trials, the control group made errors only on 
15 percent of the trials, the experimental group on 33 percent of the trials. 

Their groups were rather small, but the behavior was so uniform that 
there is no reason to doubt their data. But what do the data mean? 

The authors interpret their findings as proof that (1) the sign-gestalt in- 
terpretation of learning, at least as it has been stated, is inadequate, and (2) 
reinforcement is indispensable for learning. 

Let me propose a different interpretation, agreeing with their first con- 
clusion but disagreeing with their second. How can this be? The answer 
is indicated, I believe, in a letter I have had from Kendler, “The important
thing is to state the hypothesized factors so that a first-year graduate student 
can manipulate the concepts experimentally.” That is, the specifications 
given by a theory must be sufficiently adequate that they can be applied 
practically by any psychologist who tries sincerely to understand and use the 
theory in question. These experiments by Kendler and by Spence and 
Lippitt are a sufficient proof, it seems to me, that the sign-gestalt interpre- 
tation is inadequate in this sense. For instance, when I go carefully through 
my own article, I cannot find any specification of the conditions required for 
learning except this type of statement: “. . . the rate of mastery of this 
problem can be either very rapid or very slow, depending on how clearly the 
different routes and end-boxes are distinguished from one another, and de- 
pending on whether, on their incorrect runs, the rats are or are not allowed 
to enter the end-box containing the goal-material not desired at the time” (4, p.
37; italics added). The experiments now being reviewed employed a situa- 
tion which comes within the limits of this statement. Their results prove 
that this specification is inadequate. 

But, do their results then prove that learning requires, not merely certain 
perceptual conditions, but also reinforcement? The answer is that this is 
quite a different matter. Their experiments would support such a con- 
clusion only if, as between their experimental and control groups, this was 
the only condition which was varied. This was not the case. When they
omitted ‘motivation and reinforcement’ they also omitted a number of other
factors such as certainty of adequate stimulation, close temporal contiguity
of the several perceptual processes, and set.

Let us look more closely at their set-ups. In Spence and Lippitt’s ex-
periment, in which the rats were thirsty originally, the water was given by a
tube protruding through one wall of one end-box, the dry food in a hole in
the floor in the other end-box. Such an arrangement was almost certain
to insure that the rat, on entering the wrong end-box, would respond only
tardily and incidentally, if at all, to the dry food. For, in such a situation,
it is certain that a thirsty rat would respond first by surveying the wall in
search of the water nozzle and then by making vigorous escape efforts.
These experimenters cannot claim, therefore, that they provided the con-
dition of ‘contiguity in experience’ or ‘close time-relationships’ which Spence
himself believes is an indispensable condition for learning. 

To these comments, I realize that this reply may be made, “What you
say may be true in part, but reinforcement may be essential to get these
conditions of set, prompt response, etc., of which you speak.” To evaluate
this reply, I prefer to speak experimentally. In my original study there
really is a close parallel to Spence and Lippitt’s and Kendler’s investigations, 
but with one difference. The food and water, rather than being presented 
in distinctive ways, were presented in similar pans in comparable spots in 
the two end-boxes. Consequently, on a wrong run, the situation was such
as to guarantee that, immediately after making the run, the rat would approach
the pan containing the undesired goal-material and perceive discriminatingly
what it contained. The rats were given four or five days of training under
hunger at the start of their work, with this arrangement, before their first 
shift to thirst. On the first trial after that shift, 13 of the 23 rats went to 
the now-correct side. None of Spence and Lippitt’s or Kendler’s rats thus 
shifted. On the first trial of the next day, 19 of the 23 went to this now-
correct side. Only two of Spence and Lippitt’s 10 rats were thus correct
in their comparable trial. 

The studies by Spence and Lippitt and by Kendler, then, are funda- 
mentally inconclusive except on one point: They prove fairly well that, even 
if a perceptual or organizational theory of learning may ultimately prove to 
be sound, this sort of theory has not been stated yet in a sufficiently explicit
and detailed fashion. Perhaps we may regret that their studies did not also 
make some constructive contribution, rather than merely this negative point. 
But their studies are important anyway. The demonstration of such in- 
adequacies in the statement of theories is one of the necessary means by 
which psychology develops. 


(Manuscript received July 21, 1947)


THE SPECULATIONS OF LEEPER WITH RESPECT TO
THE IOWA TESTS OF THE SIGN-GESTALT 
THEORY OF LEARNING 


BY KENNETH W. SPENCE 
University of Iowa
AND 
HOWARD H. KENDLER 
University of Colorado 


Dr. Leeper’s flat admission that the sign-gestalt interpretation of learning 
is an inadequately formulated theory will undoubtedly come as a surprise 
to some of its other supporters. While we have been inclined many times 
to call attention to the programmatic character of much of Tolman’s 
theorizing, we were of the opinion that he and some of his supporters had 
been sufficiently specific in the case of simple trial and error learning to 
attempt an experimental test. As a matter of fact our experimental design 
was arrived at by translating into a concrete situation the theoretical 
picture of this type of learning that Tolman gave in his chapter in Moss’s 
Comparative psychology (6, p. 396). Indeed, the notion of using a Y-shaped 
maze was suggested by the fan-shaped schematization of his theory as there 
presented (see 6, Fig. 65). 

While we were fully appreciative of the difficulties of translating Tol- 
man’s theory into an experiment (see discussion in Spence and Lippitt), 
we felt, nevertheless, that we had one very distinct advantage. It happened 
that we had available at Iowa a number of psychologists whose theoretical 
position was quite similar to, if not identical with, that of Tolman: Dr.
Ralph White, the late Professor Kurt Lewin and Dr. Lippitt, then a graduate 
assistant of Lewin’s, were all very familiar with Tolman’s learning theory 
and the problems it presented. As White (7) has pointed out, the position 
of the Lewinian psychologists on the problem of learning is essentially the 
same as that of Tolman. All of these individuals discussed the experiments
at one time or another and it is difficult to believe that they were not able 
to appreciate or were unaware of the problems in perception that the ex- 
perimental testing of the theory involved. As to the design of the ex- 
periment, White’s enthusiasm for it is revealed by the following quotation, 
in which he begins his description of it (7, p. 159): 


To make more concrete this crucial distinction between S-R psychology, on the one 
side, and Tolman and Lewin and ‘common sense’ on the other, we will begin by describing a 
crucial experiment, in which the two sets of postulates lead to diametrically opposite predictions.
It is the sort of experiment the value of which has been repeatedly emphasized
by Hull in particular—the sort of experiment that shifts a basic theoretical issue from the 
level of endless verbal disputation onto the level of solid experimental fact. 


Space does not permit a detailed presentation of Tolman’s theory of 
trial-and-error learning here. In effect he assumes that a principle of as- 
sociation by contiguity is sufficient to acquire ‘cognitions’ as to “what 
stimuli lead to what subsequent stimuli and by which behaviors.” He 
suggests that the number (frequency) of the sequences in experience of 
stimulus (sign)-behavior-stimulus (significate), etc., will determine its effectiveness
in determining subsequent behavior (6). He mentions briefly a
‘law of belongingness’ and then comes to a ‘law of motivation’—to wit, “If 
some one of the resultant f’s (significants) should be especially satisfying 
the learning of that particular sign-gestalt, or even the whole set of alter- 
native sign-gestalts, would probably thereby be facilitated” (6, p. 399). 
Apparently recognizing that this formulation would essentially represent 
a reinforcement hypothesis applied to ‘cognitions,’ and mindful of his own 
latent learning studies and their apparent inconsistency with such an assumption,
Tolman immediately rejected it in a footnote. In his later writings
he has likewise not been favorable to such a reinforcement (satisfying effect) 
assumption. It was this last suggested ‘law of motivation’ that provided 
the issue which led to the institution of the present series of experiments 
(2, 3, 5). 

According to Tolman’s theory the subjects of Spence and Lippitt should 
have acquired in their training period the cognition that the signs at the 
entrance to the left alley led to food. Motivated for food and possessed of
this ‘cognition’ they should now have entered the left alley. None of them 
did. Kendler’s repetition of the study, with some variations, gave the same 
result. Spence and Lippitt concluded (5, p. 499): 


The result of our experiment would seem to cast serious doubt upon Tolman’s specification
of the variables determining the acquisition of ‘cognitions,’ and suggests that he needs
to modify it in some manner. There is no necessity, of course, that Tolman abandon his
anti-reinforcement position. ...


Kendler similarly concluded that the sign-gestalt theory as it now stands 
was not confirmed. 

It will be noted that we did not interpret, as Leeper states in his comment, 
our findings as proof that “reinforcement is indispensable for learning”
(4, p. 104). As the above quotation shows, we specifically indicated that 
Tolman did not necessarily have to abandon his anti-reinforcement position 
and we suggested a possible way for Tolman to specify his cognitions in the 
present experiment which would permit him to keep to his non-reinforcement 
position. The main conclusion that we did reach was that Tolman’s theory 
did not predict the obtained result and needed some kind of modification. 
However, the findings certainly do not lend any comfort to a non-reinforce- 
ment theorist, and one of the possible modifications that suggests itself is to 
assume that learning, cognition formation, habit formation, or what you will, 
is a function of the presence of a reinforcing (goal) situation. 
Leeper rejects this latter interpretation because of certain supposed
deficiencies in the experimental conditions which precluded the acquisition of 
the cognition with respect to food. He writes, “When they omitted ‘motivation
and reinforcement’ they also omitted a number of other factors such
as certainty of adequate stimulation, close temporal contiguity of the several
processes, and set” (4, p. 104). He implies that we could scarcely have
arranged poorer conditions for the prompt perception of the food. Leeper 
is here grasping at straws, for surely he must know that our Lewinian col- 
leagues, if not we ourselves, were just as appreciative as he of the importance 
of favorable conditions for perception. In not one of the studies was the 
food, as he implies, ‘in a hole in the floor’ and hence not readily perceivable. 
In the Spence and Lippitt experiment the food was on a tray which was at 
the level of the floor. The food, large Purina dog chow pellets, was above 
the floor level. Furthermore, the tray was placed in a position directly 
comparable to the position of the water spout in the other end-box so that 
if the thirsty animal went to the locus of the water spout it was certain to 
encounter the food. As was stated in the procedure section of the article, 
in most instances an animal either smelled the food or moved it with some 
part of its body. In the Kendler studies the food pellets (six) were spread 
out over the whole end-box, with one pellet always being in the same relative 
location as the nozzle of the water bottle. Unless the animal kept its eyes 
firmly shut it could not help ‘perceiving’ them and the probability of not 
stepping on them was practically zero. Furthermore, if the rats were 
anosmic they were wasting a lot of energy in futile sniffing. 

With regard to the question of close temporal contiguity of the several 
processes, an analysis of the time records in the Spence-Lippitt study did 
show that the time taken to reach the food box on the forced trials tended 
to become longer each day until the eighth day. In contrast the time to the 
water side decreased each day. So far as acquiring the two cognitions was 
concerned, then, conditions were more favorable for the acquisition of the 
‘water cognition,’ but it would be somewhat difficult to claim here that no 
learning took place with respect to food up to the time at which the time 
curves begin to separate, whereas it did with respect to water. Obviously, 
to do so would admit a differential factor in learning—reinforcement. This 
learning, if it did take place, should have revealed itself to some extent, at 
least, in the faster learning in the post test problem of those Ss which had 
found food in the left box as compared with a control group which did not. 
Apparently the significance of this finding was not appreciated by Leeper. 

Whether or not our experimental set-ups provide the condition of ‘con- 
tiguity in experience’ which would meet the specifications of Tolman’s 
theory is, of course, a moot question. Leeper thinks we have not. We do 
insist, however, that we were not unaware of the problems he raises and that 
prior to the conduct of the experiment a number of theorists who held 
essentially to the same learning theory as Tolman’s regarded the experi- 
ment as satisfactory in this respect. More convincing still, we feel sure, 
is the fact that in the next experiment to be reported in the series our Ss did 
acquire, in this same apparatus, cognitions with respect to the locus of food
and water when they were not motivated for them. Lest this raise the hopes
of the non-reinforcement theorist we hasten to add that some other motivation
and reward condition was deliberately introduced into this new situation.

As to the contradictory experimental results that Leeper reports, there 
are a number of other possible explanations of the difference between his 
results and ours which appear to be much more likely than any differences 
in the perceptual clearness of the situations. For one thing Leeper per- 
mitted his Ss to correct during the first period of training under hunger. 
This means that they received reinforcement within a short time after going 
to the water alley. Still another factor which could account for the differ- 
ence in results is the fact that the goal object containers were similar. The 
effect of this procedure would be to provide a potent secondary reinforcing 
agent in the water box of his experiment. Denny (1) has recently shown 
how failure to control secondary reinforcement has misled experimenters 
in this field. A final possibility is that Leeper’s Ss had not yet learned too 
well to go to the food side and when the drive cues were changed for the first 
time they reverted to a chance score. Obviously it is easy to make such 
speculative criticisms. What is needed is more experimental work and less 
such purely critical speculation. 

Finally, Leeper’s lament that our studies made no constructive contri- 
bution but merely indicated the inadequacy of Tolman’s theoretical for- 
mulation struck us as not a little puzzling. We frankly admit that the pri- 
mary purpose of our experiments was to test this theory. It is our belief 
that this is what one is supposed to do about theories. If the result in this 
instance was negative the blame can hardly be placed on us. Must results
be favorable to a theory to constitute a ‘constructive contribution’? 


(Manuscript received October 29, 1947) 